Disaster
Robotics
Results from the ImPACT Tough Robotics
Challenge
Springer Tracts in Advanced Robotics 128
Series editors

Prof. Bruno Siciliano
Dipartimento di Ingegneria Elettrica e Tecnologie dell’Informazione
Università degli Studi di Napoli Federico II
Via Claudio 21, 80125 Napoli, Italy
E-mail: siciliano@unina.it

Prof. Oussama Khatib
Artificial Intelligence Laboratory, Department of Computer Science
Stanford University
Stanford, CA 94305-9010, USA
E-mail: khatib@cs.stanford.edu
Editor
Satoshi Tadokoro
Graduate School of Information Sciences
Tohoku University
Sendai, Japan
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
To the victims of disasters
Preface
chance of survival. You have to give up.” He is lucky that he is still alive. Mr.
Motohiro Kisoi, a student of Prof. Fumitoshi Matsuno, passed away under the
debris. An American football player in the Kobe University team found a young
lady after hearing her voice from under a floor. He removed the tatami mats and the
planks from the wooden floor plates again and again, and finally found her. He tried
to drag her body out from the debris but he could not, despite his strength, because
her leg was trapped. A fire broke out and began to spread to his house. She asked
him to cut off her leg to save her, but he was unable to do so, and he was forced to
flee from the fire. “I left her to die…,” he said. His voice has echoed in my
mind ever since.
When I led the DDT Project of the Japan Ministry of Education, young fire-
fighters in the Kobe Fire Department came to the Kobe Laboratory of the
International Rescue System Institute in 2003 to learn about rescue robots.
I remember our heated discussion on how robots can help search and rescue in the
future, what is needed, the conditions at disaster sites, the firefighters’ mission, and
so on. A few weeks later, I watched a TV news story reporting that four firefighters
had died in Kobe when a burning roof caved in on them. I was surprised to see their
names. One of the four was a firefighter whom I had met at the laboratory. I still
remember his young wife weeping as she held a newborn baby at his funeral.
What is the most important value for us? My personal opinion: human life.
The mission of the ImPACT-TRC is to develop technologies for saving lives and
minimizing damage from disasters, for the safety and security of humanity. As
the program manager, I am delighted to see that this 5-year project has produced
various world’s firsts, world’s bests, and world-class technical innovations. At the
same time, it is producing social and industrial innovations.
The research members have compiled overviews of the technical and scientific
results into this book. I recommend that readers explore the original papers listed
in the references for more details.
I especially want to thank the researchers who have collaborated to produce
such excellent outcomes. The contributions of the Japan Cabinet Office,
the Japan Science and Technology Agency, the International Rescue System
Institute, Tohoku University, and other participating persons and organizations
have been significant.
Hoping for more safety and security supported by robotics.
Satoshi Tadokoro
S. Tadokoro (B)
JST/Tohoku University, 6-6-01 Aramaki-Aza-Aoba, Aoba-ku, Sendai 980-8579, Japan
e-mail: tadokoro@rm.is.tohoku.ac.jp
Our human society faces a serious threat from natural and man-made disasters that
have frequently occurred in recent times. Robots are expected to be an advanced
solution for information gathering and disaster response actions. However, there are
important issues associated with robots that must be solved to achieve sufficient
performance in disaster situations and to enable their deployment at responder stations.
The three expected functions of disaster robots are (1) to assist workers in performing
difficult tasks, (2) to reduce human risks, and (3) to reduce cost and improve
efficiency. These functions apply to three activities: emergency response operations
right after the outbreak of a disaster, such as search and rescue; damage recovery,
such as construction works; and damage prevention, such as daily inspection.
However, many robot technologies require certain environmental conditions for
achieving good performance. They are fully functional in factories and offices
because an adequate environment is set up. However, they cannot work in disaster
environments, which are extreme and unforeseen. Current robotics may be likened
to a sheltered honor student: highly capable in well-prepared settings, but helpless
outside them.
ImPACT is a strategic political investment of the Japan Cabinet Office to solve
specific social problems. The ImPACT Tough Robotics Challenge (ImPACT-TRC),
as one of the ImPACT projects, aims at making robots tougher so that they can
function in difficult situations. It seeks to relax the conditions that robots require
in order to work in disasters.
The ImPACT-TRC started at the end of 2014 and will finish in March 2019.
Sixty-two research groups form five working groups and two research committees.
The project shows its research progress to the general public at the ImPACT-TRC
Field Evaluation Forums, which are held twice a year.
This chapter introduces the innovation that this project is targeting, the approach
to realize this goal, and an overview of the major achievements at the time of writing.
Fig. 1.1 Image of research goals of the ImPACT Tough Robotics Challenge
The research plan of the ImPACT Tough Robotics Challenge is one of the results
of the analysis and discussions of this project.
Figure 1.1 shows an image of the goals of the ImPACT-TRC. This project
researches five types of robots: aerial robots, serpentine robots, construction
robots, legged robots, and Cyber Rescue Canine suits, as well as component
technologies onboard and on the network.
This project focuses on the following technical issues to make the robots tougher.
1. Accessibility in extreme environments
Accessibility to and within a site is limited in a disaster environment. Various issues
need to be solved at the system level in order to achieve high accessibility. They
include mobility and actuation for mechanical movement, sensing, human inter-
faces, and robot intelligence for robot autonomy and operators’ situation aware-
ness.
2. Sensing in an extreme environment
Sensing the situation is difficult in a disaster environment. For example, sight
is needed under darkness, fog, rain, direct sunlight, backlight, and fire. Hearing
is required under external noise and the sound produced by the robot’s own motion.
3. Recovery from task failure
Recovery from failure is necessary in order to complete a task. The entire task
may fail even if a single robot component does not work well. Conventionally,
recovery is possible only when all failure modes are known and their
countermeasures are planned beforehand. In disaster fields, however, a
robot must be able to return even if one of its motors does not work. Similarly,
the robot’s position must be estimated even if its localization module temporarily
fails.
4. Compatibility with extreme conditions
Robot technologies sometimes do not work in tough disaster environments. The
necessary conditions of the robot technologies must be eased so that the robots
can work in disaster response and recovery.
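Issue 3 above can be made concrete with a minimal sketch (hypothetical class and signal names, not the project's actual software): a pose estimator that falls back to odometry dead reckoning whenever the primary localization module stops reporting, so the position estimate degrades gracefully instead of vanishing.

```python
import math

class FallbackLocalizer:
    """Sketch: keep estimating pose even when the primary
    localization module (e.g. SLAM) temporarily fails."""

    def __init__(self, x=0.0, y=0.0, heading=0.0):
        self.x, self.y, self.heading = x, y, heading

    def update(self, slam_fix, odom_dist, odom_turn):
        """slam_fix: (x, y, heading) or None when the module fails.
        odom_dist/odom_turn: odometry increments since the last call."""
        if slam_fix is not None:
            # Primary module healthy: trust its absolute fix.
            self.x, self.y, self.heading = slam_fix
        else:
            # Fallback: dead-reckon from odometry so the estimate
            # drifts slowly instead of disappearing entirely.
            self.heading += odom_turn
            self.x += odom_dist * math.cos(self.heading)
            self.y += odom_dist * math.sin(self.heading)
        return self.x, self.y, self.heading

loc = FallbackLocalizer()
loc.update((1.0, 0.0, 0.0), 0.0, 0.0)   # SLAM fix available
x, y, _ = loc.update(None, 2.0, 0.0)    # SLAM fails: dead reckoning
```

The same pattern generalizes to other failure modes: each component publishes a degraded but usable estimate rather than an error.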
Fig. 1.2 Use scenario of the robot systems in the timeline before and after the outbreak of a disaster.
The arrow indicates time, and the blue labels show the transition of disaster phases. The yellow
boxes represent missions, and the black words represent users. The robots are shown in red. Green
text explains the difficult tasks and conditions. The performance metrics are written in purple
1 Overview of the ImPACT Tough Robotics Challenge and Strategy … 7
The aerial robot can fly under bad weather conditions, fly at night when manned helicopters cannot fly, and create
less noise in order to avoid obstructing survivor search. The Cyber Rescue Canine
suit provides capabilities of monitoring and guidance of rescue dogs during survivor
search operations under debris of collapsed buildings and landslides. The serpentine
robot, Active Scope Camera, is used for investigating situations in debris and
searching for survivors. They need high mobility and recognition capability in debris
environments. Urgent recovery construction work at risky sites is supported by the
Construction Robot, which has both high power and preciseness, as well as good
situational awareness of the operators.
Preparation before the outbreak is important for preventing damage. For example,
inspection of infrastructure and industrial facilities is needed. Robots can reduce the
cost and risk of inspection tasks by supporting or substituting for human workers.
Serpentine robots are used in the inspection of pipes and ducts of plants where
conventional tools are not useful. Legged robots are used for surveillance of risky
areas where humans cannot enter.
These robots have to be deployed in disaster prevention organizations and
companies so that these tasks can be achieved. It is important that responders and
workers practice well in order to ensure skilled use of the robots. In emergencies, robot
engineers are of no use. The robots must be ready immediately when the responders
arrive at the mission site.
Fig. 1.3 Robotics needs and potential contributions of the ImPACT Tough Robotics Challenge in
emergency response in earthquake disasters

Fig. 1.4 Robotics needs and potential contributions of the ImPACT Tough Robotics Challenge in
plant inspection and damage prevention

Figure 1.3 shows an example of the goals of the ImPACT Tough Robotics
Challenge in the case of large-scale earthquake disasters. Information gathering,
search and rescue, and construction are needed at emergency response sites as shown
in the blue boxes. Advanced equipment such as drones, rescue dogs, video scopes,
and remote construction machines are being used at present. However, as shown in
the red boxes, drones have a risk of fall and crash, although they are effective for rapid
information gathering. Fragility under heavy rain and wind is also a serious problem.
Rescue canines can effectively search for survivors by smell, but they do not bark only
when they sense survivors; they may bark for various other reasons. Handlers have
to stay near the dogs because the handlers may lose track of the dogs’ locations when
the dogs range far away to search for survivors. Video scopes are used for searching
for survivors in confined spaces of
debris. They cannot be inserted deep into large debris, and it is difficult to estimate
their position in an occluded space. Remote construction machines are effective for
construction at risky sites. However, they cannot move in difficult terrain such as
steep slopes, and their efficiency and accuracy are inferior to manned machines.
As the green boxes show, the ImPACT-TRC aims at solving such difficulties in
disaster environments so that robots become useful for emergency response and
recovery after the occurrence of a disaster.
Figures 1.4 and 1.5 show the application to plant inspection and the case study
of the response to the Fukushima-Daiichi Nuclear Power Plant Accident,
respectively.
Based on the above analysis, the goals and the objectives of each robot were
determined as follows.
Fig. 1.5 Robotics needs and potential contribution of the ImPACT Tough Robotics Challenge in
the Fukushima-Daiichi nuclear power plant accident
3. Legged Robot
Implementation: Development of practical technologies for legged robots for
investigation and inspection in damaged facilities at risk.
Project Goal: Achieve mobility in facilities, such as climbing stairs and ladders
up and down; performing non-destructive inspection, such as ultrasonic flaw
detection; and performing repair tasks, such as boring using a hammer drill
or opening/closing valves.
4. Aerial Robot
5. Construction Robot
ImPACT is a political research and development project planned by the Japan Cabinet
Office as a part of its development strategies. It aims at creating disruptive innovations
for Japan’s revival.
The ImPACT-TRC targets the following three disruptive innovations, which should
be promoted to solve the serious social problem of disasters.
Fig. 1.6 Problems of technology cycle in the disaster robotics field, and contribution of the ImPACT
Tough Robotics Challenge for innovation
Disaster robotics has high social demand. However, it is not driven by an estab-
lished market and is not economically self-sustaining; its market size is small. There-
fore, the field of disaster robotics has the following problems, as shown in Fig. 1.6.
Disaster robotics faces the fundamental issue of how to fuel the necessary technology
cycle and how to create the needed disruptive innovation.
1. Industries
Disaster robots are procured by national and local governments, and the
market depends on governmental policy. The market size is small, and the products
do not have enough volume efficiency. Robots need the integration of a wide
variety of technologies, and the cost of their development and maintenance is
high.
2. Users
Users do not have enough knowledge and awareness of what robots can do and
what limitations they have. Users’ procurement budgets for disaster robots are
limited.
3. General Public
The general public recognizes the necessity of disaster robots. In some cases,
their expectations are too high; in other cases, they hold negative opinions
based on groundless biases.
4. Researchers and Developers
The problems related to disasters are technically difficult, and the capability of
disaster robots is not yet sufficient. The technologies are not directly connected with
the market. Universities usually take on such problems, but researchers
sometimes do not focus on real use cases under actual conditions and
requirements, although these are the most important technical challenges in this
field.
For these reasons, the technology cycle has deadlocked, and the innovation rate
for disruptive technologies has not been sufficiently fast.
In order to resolve this discrepancy, the ImPACT-TRC offers the following.
1. Industries
The research results are widely introduced and demonstrated to the industry in
realistic situations. This opens the way for industry to utilize them for new busi-
ness, and for new solutions to current problems. This integrates the market of
disaster robotics with the large business markets.
2. Users
Disaster robotics is explained to actual and potential users through tests con-
ducted in simulated disaster situations, by applying the robots to real disasters, and by
asking for user evaluation, so that users recognize the capabilities and limitations
of the robots. Collaborative improvement of robot capabilities leads to the procurement
and deployment of disaster robotics.
3. General Public
Open demonstration of R&D progress and results is performed. It promotes the
general public’s recognition and understanding.
4. Researchers and Developers
The research field is established, forming a good environment where researchers
can study disaster robotics. Evaluation metrics are developed through user-oriented
research and collaboration with users and industries.
Search Items: Disaster category, task, portability, technology category, use envi-
ronment, and past use cases.
Fundamental Information: Name, functions, performance, photos, size, weight,
date of development, research project, and contact information.
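The two field groups above describe a searchable database record. As a rough illustration (field names and types are inferred from the text, not an actual schema), such an entry and a query over its search items might look like:

```python
from dataclasses import dataclass, field

@dataclass
class RobotRecord:
    """Sketch of a database record combining the search items and
    fundamental information listed above (illustrative types only)."""
    name: str
    disaster_category: str
    task: str
    portability: str
    technology_category: str
    use_environment: str
    past_use_cases: list = field(default_factory=list)
    functions: str = ""
    performance: str = ""
    contact: str = ""

def search(records, **criteria):
    """Filter records by exact match on any search item."""
    return [r for r in records
            if all(getattr(r, k) == v for k, v in criteria.items())]

db = [RobotRecord("ASC", "earthquake", "search", "backpack",
                  "serpentine", "confined space")]
matches = search(db, task="search")  # records whose task is "search"
```

A real deployment would add free-text search and photo storage, but the record/filter split shown here is the core of such a catalog.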
The outcome of this project is evaluated and demonstrated at the Field Evaluation
Forum (FEF) that was organized twice a year both outdoors and indoors at Tohoku
University from 2015 to 2017, and twice a year outdoors at Fukushima Robot Test
Field (Fukushima RTF) in 2018. It consists of open demonstrations and closed eval-
uations. In the open part, the robots and technologies are tested for demonstration
in front of a general audience using mock collapsed debris and industrial facilities.
In the closed part, new risky and fragile technologies are tested, and researchers are
provided with feedback from specialists and users.
Figure 1.7 shows pictures taken at the FEF at Fukushima RTF on June 11, 2018.
The results of each FEF are summarized in videos on the YouTube ImPACT Tough
Robotics Challenge channel [1–5].
The objectives of the FEF are summarized as follows.
• The Cyber Rescue Canine unit finds survivors, and shares the information
with the OSSOC.
4. Rescue of Survivors
• The Active Scope Camera, a serpentine robot, investigates the inside of the
debris, hears the voice of a survivor, and identifies the position.
• Firefighters enter the debris, rescue the survivors, and transfer them to
medical facilities.
Research conducted over 3.5 years has produced outstanding outcomes, some of
which are the world’s first, the world’s best, and world class, as listed below.
Note that “world’s first” and “world’s best” are stated based on the author’s knowl-
edge of this domain at the time of writing, and might include misunderstandings
due to ignorance. These projects used various methods in robotics, including soft
robotics and deep learning.
Fig. 1.8 Dragon Firefighter prototype 3 m long at field evaluation forum on June 14, 2018
Fig. 1.9 Wheel-type Serpentine Robot and Omni-Gripper at field evaluation forum on June 14,
2018
Fig. 1.10 Legged Robot and Construction Robot at field evaluation forum on June 14, 2018
• Four-legged robot that can move in a plant and perform inspection remotely
and autonomously (Fig. 1.10).
• Robot hands of 30-cm size that can keep grasping 50-kg objects without
electricity. (World’s First)
• Opening and closing a valve with a torque of 100 Nm by a legged robot. (World
Class)
• Moving on four legs, on two legs, or by crawling. (World Class)
• Climbing vertical ladders.
• Virtual bird’s-eye-view image for teleoperation using recorded past images.
• 3D self-localization and mapping including environments.
• Generation of a sound source map.
• Estimation of surface conditions of objects by whisking.
• Testing of functions at Field Evaluation Forum.
5. Aerial Robot
• Robust flight for information gathering under difficult conditions. (World
Class)
• Hearing and identification of voices from the ground during flight using an onboard
microphone array. (World’s First)
• Environmental robustness (wind 15 m/s, rain 300 mm/h, and navigation near
structures with 1-m distance). (World Class)
• Continuous flight with two stopped propellers. (World Class)
• Load robustness (height change of 50 mm with a step weight change of 2 kg).
(World Class)
• Onboard hand and arm that maintain the position of the center of gravity
during motion.
• Wireless position sharing system for aerial vehicles.
• High precision 3D map generation using multiple GPSs.
• Hierarchical multi-resolution database for 3D point cloud.
• Use at the Northern Kyushu Heavy Rain disaster for capturing high-resolution
images (1 cm/pixel) in an area of difficult accessibility in Toho Village,
Fukuoka Prefecture, Japan.
6. Construction Robot
• Double-swing dual-arm mechanism enabling dexterous but heavy work
(Fig. 1.10). (World’s Best)
• High power and high precision control necessary for task execution using two
arms. (World Class)
• Durable force and tactile feedback with no sensor at hand. (World Class)
• Pneumatic cylinder with low friction. (World Class)
• High power hand for grasping and digging.
• Real-time bird’s-eye-view image by drone.
• Virtual bird’s-eye-view image by multiple onboard cameras.
• Vision through fog.
• Immersive remote-control cockpit.
• Testing of functions at Field Evaluation Forum.
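The "hierarchical multi-resolution database for 3D point cloud" listed under the Aerial Robot can be illustrated with a toy sketch (not the project's implementation): store one voxel grid per resolution level, keeping per-voxel centroids, so coarse levels answer wide-area queries cheaply while fine levels preserve detail.

```python
from collections import defaultdict

def build_pyramid(points, voxel_sizes=(4.0, 1.0, 0.25)):
    """Toy multi-resolution point-cloud store: one voxel grid per
    level, coarse to fine. Each voxel keeps the centroid of the
    points that fall inside it. Voxel sizes are illustrative."""
    pyramid = {}
    for size in voxel_sizes:
        cells = defaultdict(list)
        for x, y, z in points:
            key = (int(x // size), int(y // size), int(z // size))
            cells[key].append((x, y, z))
        # Reduce each voxel to its centroid for compact storage.
        pyramid[size] = {
            k: tuple(sum(c) / len(v) for c in zip(*v))
            for k, v in cells.items()
        }
    return pyramid

pts = [(0.1, 0.1, 0.0), (0.2, 0.1, 0.0), (3.9, 3.9, 0.0)]
pyr = build_pyramid(pts)
# Coarse level merges all three points; fine level keeps them apart.
```

Production systems typically use an octree rather than independent grids, so each fine cell links to its coarse parent, but the level-of-detail idea is the same.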
The Northern Kyushu Heavy Rain disaster on July 5–6, 2017 caused 36 fatalities in
Fukuoka Prefecture and Oita Prefecture, and 750 collapsed houses in a wide area.
The ImPACT-TRC team gathered information by an aerial robot on July 7–8, and
contributed to the disaster response. A drone PF-1 developed by Autonomous Control
Systems Laboratory Ltd. (ACSL) took high-resolution photos (1 cm/pixel) in a valley
area 3 km long at a speed of 60 km/h beyond visual range by specifying waypoints.
Ortho-images, as shown in Fig. 1.11, as well as the high-resolution photos were
provided to the Fire and Disaster Management Agency (FDMA) and the National
Research Institute for Earth Science and Disaster Resilience (NIED), and were used
for disaster prevention.
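The relation between flight altitude and the 1 cm/pixel resolution follows the standard ground-sample-distance formula for a pinhole camera; the sensor and lens numbers below are purely illustrative, not the PF-1's actual specification.

```python
def ground_sample_distance(altitude_m, pixel_pitch_um, focal_length_mm):
    """GSD (m/pixel) = altitude * pixel pitch / focal length,
    with all quantities converted to metres (pinhole-camera model)."""
    return altitude_m * (pixel_pitch_um * 1e-6) / (focal_length_mm * 1e-3)

# Illustrative numbers only: a 2.4 um pixel pitch and a 24 mm lens
# would give 1 cm/pixel from 100 m altitude.
gsd = ground_sample_distance(100.0, 2.4, 24.0)
print(f"{gsd * 100:.1f} cm/pixel")  # prints 1.0 cm/pixel
```

The same formula works in reverse for mission planning: fix the required GSD and solve for the maximum altitude.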
Table 1.2 Examples of user needs and solutions provided by the ImPACT Tough Robotics Challenge

Solution: Cyber Rescue Canine suit that monitors a dog’s behavior and conditions, and commands actions.
User needs: Rescue dogs must be near their handlers because a dog’s behavior and conditions are not known remotely. Actions are not recorded and cannot be reported in detail.
Outcome: [Search and Rescue] Rescue dogs can be used a few kilometers away. (World’s First)

Solution: Intrusion of the Active Scope Camera into a few-centimeter gap by self-propelled ground motion and levitation.
User needs: The area of investigation is limited because of insufficient mobility. Position inside the debris cannot be measured. Unable to listen to survivors’ voices and to construct a map.
Outcome: [Search and Rescue, Emergency Response, Damage Prevention] The ASC enhances its area of investigation and provides a bird’s-eye view by levitation. It is able to listen to survivors’ voices and construct a map. (World’s First)

Solution: Serpentine robots that can move through plant pipes, ducts, rough terrain, steep stairs, steps, ladders, and confined spaces.
User needs: The area of investigation is limited and cannot cover the whole plant because of insufficient mobility.
Outcome: [Damage Prevention, Emergency Response] The robot can reach many critical places in plants for visual inspection. (World’s First)

Solution: Grasping, pushing, and hooking without control.
User needs: The hand must be changed to adapt to targets. Speed is slow because complex control is necessary. Motion planning is needed to adapt to various objects at the disaster site.
Outcome: [Damage Prevention, Emergency Response, Search and Rescue] The hand can easily and quickly grasp a wide variety of objects even if they have sharp edges. (World’s First)

Solution: Robot hand of 30-cm size that can grasp objects weighing 50 kg without electricity.
User needs: There is no small-size, high-power hand for disaster and factory applications. Heat is a serious problem in tasks that require a large grasping force.
Outcome: [Damage Prevention, Emergency Response, Recovery] The hand continues grasping without electricity with a force of 150 N per finger while maintaining low temperature. (World’s First)
The PF-1 was also used for gathering information on landslides at the Western
Japan Heavy Rain Disaster on July 25–26, 2018.
As a prototype of the Active Scope Camera (ASC), a thin serpentine robot was used
from December 2016 to February 2017 for investigating inside the nuclear reactor
building of the Fukushima-Daiichi Nuclear Power Plant Unit 1, which exploded in
March 2011. It was suspended by a crane system and entered the debris through
boreholes and gaps in the structures, capturing images with the camera mounted
on its tip, as shown in Fig. 1.12. The situation of the roof structure and a fuel
transfer machine, as well as the shift of a well plug above the pressure containment
vessel, were checked, and 3D models were produced.

Fig. 1.11 Part of the ortho-image of Toho village, Fukuoka prefecture on 8 July 2017 during the
Northern Kyushu heavy rain disaster

Fig. 1.12 Image captured by the active scope camera in Fukushima-Daiichi nuclear power plant
[6]

A dose meter installed at the tip of the ASC measured the dose rate of radiation in the well plug. These data were
used for the planning of decommissioning works and construction.
The wheel-type serpentine robot with the Omni-Gripper entered a house that collapsed
in the Western Japan Heavy Rain Disaster on July 25–26, 2018. It extracted valuables
from the house under the guidance of a resident.
The Cyber Rescue Canine has been regularly tested at the exercises of the Japan
Rescue Dog Association. Their certified dogs wear the Cyber Rescue Canine suit
and perform the training missions. Lessons learned at the exercises are fed back to
the researchers in order to improve the suit, thus increasing its technology readiness
level.

Fig. 1.13 Testing of the Cyber Rescue Canine suit with the National mountain rescue party of Italy
(Corpo Nazionale Soccorso Alpino e Speleologico; CNSAS)

During the tests performed by the National Mountain Rescue Party of Italy
(Corpo Nazionale Soccorso Alpino e Speleologico; CNSAS), in collaboration with
the FP7 SHERPA Project, their rescue dog wore the Cyber Rescue Canine suit, and
searched for a survivor hidden on the slope of a mountain, as shown in Fig. 1.13.
It showed that the handler could easily monitor the dog’s position from a remote
site, and could identify the target at which the dog was barking. Issues for future
deployment were discussed after the test. The Cyber Rescue Canines have stood by
for response at multiple disasters in Japan since 2017.
The Construction Robot and the Legged Robot were selected as simulation plat-
form robots of the Tunnel Disaster Challenge of the World Robot Summit (WRS), a
Robot Olympics.
The research outcomes have been propagated across industries. More than 20
companies are testing the technologies for their own applications, and research col-
laborations with the ImPACT-TRC researchers have started.
1.8 Conclusions
This chapter introduced an overview of the technical, social, and industrial disruptive
innovations of the ImPACT Tough Robotics Challenge. The author, as the program
manager, hopes that this project contributes to the safety and security of humanity.
…the International Rescue System Institute, and many supporters. I, as the program
manager, sincerely appreciate all their contributions.
References
Abstract The Active Scope Camera (ASC) has self-propelled mobility with a ciliary
vibration drive mechanism for inspection tasks in narrow spaces, but it still lacks the
mobility and sensing capabilities necessary for search and rescue activities. The
ImPACT-TRC program aims to improve the mobility of the ASC drastically by
applying a new air-jet actuation system that floats the ASC in the air, and to integrate
multiple sensing systems, such as vision, auditory, and tactile sensing functions, to
enhance its searching ability. This paper reports an overview of the air-floating-type
Active Scope Camera integrated with multiple sensory functions as a thin serpentine
robot platform.
2.1.1 Introduction
In a great disaster, the rapid search and rescue of victims trapped in collapsed buildings
is one of the major challenges of disaster first response. Video scopes and fiberscopes,
which are widely used in urban search and rescue, have limited access to the deep
interior of rubble. For example, they often get stuck on obstacles and cannot surmount
steps and gaps, because the scopes’ cables are so flexible that the force necessary for
insertion cannot be delivered by simply pushing them from outside the rubble. The
authors have developed a long flexible continuum robot called the Active Scope
Camera (ASC) to search narrow confined spaces in urban search and rescue [22, 49].
The ASC can self-propel forward with the ciliary vibration drive [39], which generates
propulsion force by vibrating tilted cilia wrapped around the flexible robot. The ASC
has been used at several disaster sites, such as a survey after the 2016 Kumamoto
earthquake [2] and surveys of the Fukushima Daiichi Nuclear Power Plant in 2017
using a vertical-exploration-type ASC [19].
In the ImPACT Tough Robotics Challenge (ImPACT-TRC) program, the authors
have developed a new thin serpentine robot with the ciliary vibration drive,
dramatically advancing the mobility of the robot and applying new sensing
technologies to gather the information necessary for search and rescue missions.
To advance the mobility, we developed a new technology that flies the tip of the
robot by air injection to surmount the steps and gaps in debris (in Sect. 2.2). To the
best of our knowledge, this is the world’s first realization of a snake-like robot that
flies in the air by such air injection. This jet injection technology is also applied to a
flying hose robot with water jets (in Sect. 2.2.6).
2 ImPACT-TRC Thin Serpentine Robot Platform for Urban Search and Rescue 27

As for the new sensing technologies to gather the information useful for search
and rescue operations, we developed a thin serpentine robot platform that integrates
the air-jet propulsion system and multiple sensing functions. The robot can gather
multiple kinds of sensory information: acoustic information with a microphone/speaker
array, kinesthetic/tactile information with IMU sensors and a vibration sensor, and
visual information with a high-sensitivity high-speed camera. First, we developed
a speech enhancement system to search for victims’ voices effectively, and a
sound-based posture (shape) estimation method to localize the microphones and
speakers mounted on the robot (in Sect. 2.3). Second, we developed a Visual SLAM
with the high-speed camera to visualize 3D environmental structures and localize the
robot in confined spaces such as debris (in Sect. 2.4). We also integrated the image
recognition system, the same technology described in Sect. 4.3, to detect signs of
victims visually. Finally, we developed a tactile sensing system with the vibration
sensors to estimate the contact points and to support the operation with vibrotactile
feedback (in Sect. 2.5).
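Speech enhancement with a microphone array is commonly built on beamforming. The following toy delay-and-sum sketch (a generic textbook method, not necessarily the system used here) shows how aligning channels by their steering delays makes a voice add coherently while uncorrelated noise averages down.

```python
import numpy as np

def delay_and_sum(signals, delays_samples):
    """Minimal delay-and-sum beamformer: undo each channel's
    steering delay, then average, so sound from the steered
    direction adds coherently while uncorrelated noise cancels."""
    aligned = [np.roll(sig, -d) for sig, d in zip(signals, delays_samples)]
    return np.mean(aligned, axis=0)

# Toy demo: the same 'voice' reaches three mics with 0/2/4-sample lags,
# each channel corrupted by independent noise.
rng = np.random.default_rng(0)
voice = np.sin(2 * np.pi * 0.05 * np.arange(200))
mics = [np.roll(voice, d) + 0.3 * rng.standard_normal(200) for d in (0, 2, 4)]
enhanced = delay_and_sum(mics, [0, 2, 4])
# Residual noise drops roughly by 1/sqrt(3) versus a single microphone.
```

Real systems estimate the delays from the array geometry or cross-correlation and use fractional-delay filters, but the coherent-averaging principle is the same.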
In this chapter, we introduce an overview of the ImPACT-TRC thin serpentine
robot platform and the detailed technologies applied to it.
The thin serpentine robot platform developed within the ImPACT-TRC program
enhances the mobility and sensing capabilities of the conventional Active Scope
Camera (ASC). We call this robot the “ImPACT-ASC” in this chapter.
Conventionally, the Active Scope Camera (ASC) adopts the ciliary vibration drive
mechanism to generate propelling force between the flexible cable and the contact
surface [39]. The ciliary vibration drive activates the tilted cilia wrapped over the
whole surface of the body with small vibration motors attached inside the body. The
vibration produces bending and recovery movements of the cilia, and the rapid
stick/slip transitions at the tips of the cilia on the ground then generate a forward
driving force, because the tilted cilium has an asymmetric friction property. Although
the thrust depends on the contact material, the obtained propulsive force is
approximately several N/m. The ciliary vibration drive has many advantages, such as
flexibility, a simple structure, and light weight. In particular, this mechanism has a
significant benefit in avoiding sticking on the rubble due to friction, because the
driving force increases as the contact area increases.
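The net forward thrust produced by asymmetric friction can be reproduced in a toy 1-D simulation (all parameter values are illustrative, not measured): an oscillating drive force combined with friction that resists backward sliding more strongly than forward sliding yields a positive mean displacement.

```python
import math

def simulate_ciliary_drive(steps=20000, dt=1e-4):
    """Toy 1-D model of a vibration drive with asymmetric friction:
    a symmetric oscillating drive force plus kinetic friction that is
    weaker when sliding forward (cilia bending) than backward yields
    net forward motion. Parameters are illustrative, not measured."""
    m = 0.1                        # body mass, kg
    mu_fwd, mu_back = 0.2, 0.8     # asymmetric friction coefficients
    g = 9.81
    x, v = 0.0, 0.0
    for i in range(steps):
        t = i * dt
        drive = 2.0 * math.sin(2 * math.pi * 50 * t)  # vibration force, N
        mu = mu_fwd if v >= 0 else mu_back
        # Kinetic friction opposes the current sliding direction.
        friction = -mu * m * g * (1 if v > 0 else -1 if v < 0 else 0)
        v += (drive + friction) / m * dt
        x += v * dt
    return x  # positive: net forward displacement

net = simulate_ciliary_drive()
```

Because backward excursions are damped harder, the body spends most of each vibration cycle moving forward, which is the same rectification effect the tilted cilia provide.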
The developed ImPACT-ASC is a long flexible tube-type robot with a length of
7–10 m and a diameter of 50 mm, aimed at insertion into narrow spaces and
exploration deep in the rubble. The ImPACT-ASC also adopts the ciliary vibration
drive proposed for the tube-type ASC [49], which uses a hollow corrugated tube
wrapped with tilted cilia tapes in a spiral shape; all vibration motors, sensors, and
wires are installed in the tube.
Figure 2.1 is a conceptual diagram of the targeted disaster response mission.
In a rescue operation at a collapsed building, only small gaps allow access into the
rubble, and victims may be trapped in open spaces formed within it. For example, in
our survey of the 2016 Kumamoto earthquake [2], we observed typical collapsed
wooden houses in which the upper floor had crushed the first floor. In such cases, a
small gap is often formed at ground level on the side of the collapsed building, but it
is difficult for first responders to enter. The ImPACT-ASC is intended to approach the
inside of the rubble through such a narrow entrance to search for victims. For efficient
exploration, identifying the location and trajectory of the robot is necessary so that
first responders can grasp the situation of the victim and the rubble.
In the ImPACT-TRC program, we built a simulated collapsed building assuming
such disaster environments in the outdoor field of the Aobayama campus of Tohoku
University in 2016, and evaluated the performance of remote-controlled exploration
inside the rubble. The advanced technologies developed for the ImPACT-ASC are
summarized as follows.
28 M. Konyo et al.
Fig. 2.1 Targeted victim search missions for the ImPACT-TRC thin serpentine robot platform
(a) Concept (b) Overview of Floating ASC on rubble
Twist encoders are useful for providing reference data for estimating the posture of
the robot and for the Visual SLAM described later.
The most difficult challenge was inserting the hairy body of the long flexible
robot without damaging its cilia. We proposed a special thruster using opposed
flexible rollers whose cylindrical surfaces are covered with tensioned flexible wires.
The wires grip the robot body between the cilia to avoid damaging them. The
thruster can push and twist the ASC and measure the inserted length and twisting
angles; the error of the measured insertion length is less than 10%. We confirmed that
the thruster was able to push and twist the ASC even in a three-dimensional rubble
environment [67].
III. Victim Detection Technologies
In complicated rubble environments, it is hard to search for victims with the camera
alone. To detect victims who do not appear in the camera image, auditory
information from the microphone and loudspeaker arrays mounted on the robot
is useful and efficient. First responders can call out to victims through the loudspeakers
to check for their presence. We also developed several technologies that emphasize
speech by eliminating the sound noise generated by the vibration motors. Details are
described in Sect. 2.3.1.
We also developed an image recognition system that automatically recognizes
the material of rubble and detects images similar to templates registered
in advance. For example, if we can obtain images of the clothes that a victim is
wearing, the proposed system can automatically detect similar textures in the
rubble and alert the operator. This technology is described in Sect. 4.3.
Visual SLAM
Posture Estimation
2.2.1 Introduction
A general problem of flexible thin serpentine robots is controlling the head motion
in rubble. Head control is necessary to change direction and surmount the rubble.
2 ImPACT-TRC Thin Serpentine Robot Platform for Urban Search and Rescue 31
The camera direction must also be changed to look around in debris. However,
the thin robot body limits the space available for installing multiple actuators.
In addition, elevating the head of a long flexible robot against gravity is not easy
because of its soft body. For example, although the authors have proposed a tube-type
active scope camera (ASC) that has a head-bending mechanism with eight
McKibben actuators [20], the head often topples to the side when it tries to
elevate.
To address this, we proposed a new head control method for the thin serpentine
robot: flying the head using an air jet [35]. The air jet elevates the head, allowing
the robot to easily control the head direction and avoid obstacles because the head is
in the air. It also enables the robot camera to look around from a higher point
of view. The air jet mechanism has the following advantages for a long serpentine
robot:
• The air jet can directly generate a reaction force on the nozzle, regardless of the
flexible body. Force transmission, which causes the deformation of the soft body,
is not necessary.
• The reaction force depends only on the direction and rate of the flow at the
nozzle outlet. The properties of the surface blown by the air jet do not affect
the force generation as long as there is a small gap.
• The air jet only requires a nozzle on the head and a tube in the body, which are easily
installed in the thin long body and contribute to a simple and lightweight structure.
The primary challenge in realizing head floating and steering of a long serpentine
robot is how to control the reaction force induced by the air jet. For example, if the
head emits the air jet toward the ground without any control, the head bends backward
and control can be lost. However, the reaction force magnitude is not suitable as
a control input because of the delay between the nozzle outlet and the valve, which
is located at the root of the robot because of size restrictions.
Thus, through the project, we have proposed the concept of controlling the air jet
direction, instead of its intensity, to realize head floating and steering [29, 35]. This
section mainly introduces the mechanical design of a direction-controllable nozzle
that can control the air jet direction in the pitch and roll axes with a thin structure.
We need to control the air jet direction with respect to gravity at any head posture
to enhance the mobility of the floating head; thus, the nozzle needs to control the air
jet direction along multiple axes. The major challenges are how to change the air jet
direction without a large resistance to the flow, which would cause a critical pressure
drop and reduce the reaction force, and how to rotate a nozzle connected to an air
tube. Ordinarily, a swivel joint is used to change the rotational direction of a flow
channel. However, a swivel joint is not suitable for delivering large flow rates because
it has a small pipe section and causes a large pressure drop. We propose herein a new
approach: a nozzle with a flexible flow channel that changes the air jet direction while
keeping a large rotational range.
This section is outlined as follows: Sect. 2.2.2 introduces the related studies;
Sect. 2.2.3 presents the proposed biaxial active nozzle with a flexible flow channel
composed of a flexible air tube used to change the air jet direction along two axes
with a low pressure drop; Sect. 2.2.4 validates that the nozzle can change the air jet
direction with a low pressure drop; Sect. 2.2.5 describes the combination of the biaxial
active nozzle with a 7 m ASC and demonstrates that the robot can look around and
explore the rubble; and the final section, Sect. 2.2.6, introduces an aerial hose-type
robot for firefighting that utilizes a similar technology to the proposed nozzle.
Methods to obtain a reaction force using an air or water jet were proposed several
decades ago. For example, Xu et al. [66] proposed an air jet actuator system to
investigate the mechanical properties of a human arm joint. The direction of an air
jet on the wrist was switched using a Coanda-effect valve to provide perturbation to
the arm. Mazumdar et al. [43] proposed a compact underwater vehicle controlled by
switching the direction of water jets using a Coanda-effect high-speed valve. Silva
Rico et al. [57] more recently proposed an actuation system based on a water jet.
Three nozzles were mounted at the tip of the robot and connected to three tubes. The
reaction force applied to the tip was controlled by controlling the flow rate in each
flow channel at the root of the tube. Using this method, the robot could control the
direction of the head on the ground and underwater. However, no research has yet
realized the direction control of an air jet on a thin serpentine robot.
The air-floating-type ASC targeted herein is a long and thin continuum robot with
an entire length of 7 m and a diameter of approximately 50 mm, including the cilia
on the body surface, as described in Sect. 2.1. The robot is fitted with an active air jet
nozzle at the tip (Fig. 2.5). The active nozzle causes the tip to float by generating a
reaction force with the air jet. A special mechanism is needed to control the air jet
direction in the pitch and roll directions.
This study proposed a method of changing the air jet direction by deforming the
flexible tube connected to the nozzle outlet (hereinafter called the nozzle tube). The
pressure drop was expected to be low because the flexible tube deforms smoothly.
We proposed herein a mechanism for
(a) rotation along pitch axis (b) rotation along roll axis
Fig. 2.6 A concept of a biaxial active nozzle with a flexible tube. a the air jet direction can be
changed along the pitch axis by tube deformation. b the air jet direction can be rotated along the
roll axis. The nozzle tube outlet can rotate while maintaining its shape because the bearing on the
fixture prevents tube twisting
the biaxial active nozzle (Fig. 2.6). Figure 2.6 also defines the coordinates.
The nozzle tube root is smoothly connected to the nozzle outlet, and the nozzle tube
tip is fixed to a fixture via a bearing. The fixture can rotate around the roll and
pitch axes while allowing the tube to rotate along its longitudinal direction. When
the fixture rotates along the pitch axis, the nozzle tube deforms (Fig. 2.6a) and the
air jet direction changes. In contrast, when the fixture rotates around the roll axis,
the nozzle tube rotates while maintaining its shape (Fig. 2.6b) because the bearing
prevents tube twisting. Therefore, the tube can rotate infinitely along the roll axis,
and the jet direction can be rotated with it. This principle can be proven
by supposing that the center line of the nozzle tube is a smooth curve.
The air jet reaction force vector $\boldsymbol{f}$ generated by the nozzle can be written as
follows:

$$ \boldsymbol{f} = f_c \begin{bmatrix} \cos(\pi - \psi_p) \\ -\sin(\pi - \psi_p)\sin\psi_r \\ \sin(\pi - \psi_p)\cos\psi_r \end{bmatrix} \tag{2.1} $$
where the roll angle $\psi_r$ and the pitch angle $\psi_p$ describe the attitude of the tip of the
nozzle tube (Fig. 2.6), and $f_c$ is the force caused by the momentum and pressure of
the fluid:

$$ f_c = \dot{m}u + (P_{out} - P_{air})A \tag{2.2} $$
where $\dot{m}$ is the mass flow rate of the air jet from the nozzle tube; $u$ is the flow velocity
at the nozzle tube outlet; $P_{out}$ and $P_{air}$ are the pressures at the nozzle tube outlet and
the atmosphere, respectively; and $A$ is the sectional area of the nozzle tube outlet.
Ignoring the force of the fluid flowing into the nozzle, $\boldsymbol{f}$ is the net force applied to
the nozzle. The net force direction can be changed by changing the air jet direction.
We rotated the fixture around the roll and pitch axes using a differential mechanism.
With the differential mechanism, only the light nozzle tube is placed in the rotating
part, reducing the weight the motors must drive. Figure 2.7 shows the mechanism.
The motor inputs are transmitted to bevel gears 1 and 2. When bevel gears 1 and 2
rotate in the same direction (input A → output A), bevel gear 3 is locked and the
entire mechanism containing the tube tip fixture rotates around the roll axis. When
bevel gears 1 and 2 rotate in opposite directions (input B → output B), bevel gear 3
rotates and the nozzle tube tip fixture rotates around the pitch axis.
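The input-output relation of the differential mechanism can be sketched as follows. Unit gear ratios are an idealization of ours (the actual ratios are not given in the text): the common-mode component of the two motor rotations drives the roll axis, and the differential component drives the pitch axis.

```python
def differential_outputs(theta1, theta2):
    """Idealized differential mechanism of the biaxial active nozzle.

    theta1, theta2 -- rotations of bevel gears 1 and 2 (motor inputs) [rad]
    Returns the (roll, pitch) rotations of the nozzle tube tip fixture.
    Unit gear ratios are assumed purely for illustration.
    """
    roll = 0.5 * (theta1 + theta2)   # same-direction component -> roll
    pitch = 0.5 * (theta1 - theta2)  # opposite-direction component -> pitch
    return roll, pitch
```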
We developed a biaxial active nozzle, shown in Fig. 2.8, based on this mechanical
design. Using the differential mechanism, the nozzle tube fixture is rotated by
motors arranged at the front and rear of the mechanism. High-pressure air is
supplied from an air compressor; the air passes through the air tube, is accelerated to
sonic speed by the nozzle, and is emitted from the nozzle tube. The inner diameter
of the nozzle tube is 2.5 mm, while that of the air tube passing through the body
is 8 mm. The biaxial active nozzle has an outer diameter of 46 mm, an entire length
of 152 mm, and a total weight of 70 g.
We conducted experiments to confirm whether the air jet direction can be changed
by the biaxial active nozzle (i.e., whether the direction of the air jet reaction force
changes) and whether a severe pressure drop occurs when the air jet direction
changes (i.e., whether the magnitude of the reaction force stays constant). Assuming
that the nozzle is mounted at the tip of the flexible long robot, we built the
experimental system shown in Fig. 2.9. The biaxial nozzle inlet was connected to the
air tube (inner diameter: 8 mm, length: 10 m), through which the air compressor
delivered air. A six-axis force sensor (ThinNANO 1.2/1-A, BL AUTOTEC, LTD.)
was attached to the biaxial active nozzle to measure the air jet reaction force. The
air tube was given sufficient slack to avoid measuring forces caused by its
deformation. During the experiment, an electropneumatic regulator kept the pressure
at the inlet of the air tube at P = 0.54 MPa. The direction of the nozzle tube outlet
was changed around the roll and pitch axes by commanding the motors in the active
nozzle. The commanded pitch angles were ψpc = 10π/24, 11π/24, …, 18π/24 and the
commanded roll angles were ψrc = −5π/24, −4π/24, …, 5π/24 (108 conditions). For
each angle setting, we measured the reaction force 500 times over 5 s to calculate
the mean and standard deviation. The roll and pitch angles of the measured force
direction were calculated in the coordinates of Fig. 2.6. Because we expected
backlash in the differential mechanism, to measure the attitude of the outlet of the nozzle
Fig. 2.10 Relationship between the nozzle outlet angles (roll and pitch) and the measured force
angles (roll and pitch) under constant pitch and roll angles. a The roll angle of the nozzle outlet and
that of the reaction force correspond well. b The pitch angle of the nozzle outlet and that of the
reaction force are almost the same, although a small difference within 0.17 rad is observed
tube, we took photographs of the nozzle tube outlet with a camera placed far away
from the nozzle. The roll and pitch angles of the nozzle tube outlet (ψrm , ψ pm ) were
calculated from the photo.
Figures 2.10 and 2.11 present the experimental results. Figure 2.10a-1, 2, and 3
show the relationship between the roll angle of the force vector estimated from
the measured nozzle tube outlet attitude ψrm and the roll angle of the measured
force vector when the commanded pitch angles were ψpc = 5π/12, π/2, and 3π/4,
respectively. Figure 2.10b-1, 2, and 3 show the relationship between the pitch
angle of the force vector estimated from the measured nozzle tube outlet attitude
(ψpm − π) and the pitch angle of the measured force vector when the commanded roll
angles were ψrc = −π/6, 0, and π/6, respectively. Figure 2.11a, b show the relationship
between the nozzle outlet attitude (pitch and roll angles) and the magnitude of the
measured force when the commanded roll and pitch angles were fixed. Figure 2.10a,
b confirm that the direction of the reaction force changed around the two axes, and
the force direction changed monotonically with the attitude of the nozzle tube.
Figure 2.11 shows that the reaction force was almost constant at all roll and pitch
angles, implying that no significant pressure drop occurred when the nozzle tube
outlet direction changed.
Fig. 2.11 Relationship between the nozzle outlet angles (a roll and b pitch) and the reaction force
magnitude. The reaction force magnitude is almost constant regardless of the outlet angles, indicating
that the tube deformation does not resist the flow
2.2.4.4 Discussion
The roll angle of the nozzle tube direction and that of the reaction force vector
almost coincided. In contrast, the pitch angles did not coincide because we ignored
the force of the fluid flowing into the nozzle. However, the error was within 0.17 rad,
and the standard deviation was approximately 0.027 rad. We consider this error
acceptable for practical use because it can be calibrated in advance using the relation
in Fig. 2.10. Meanwhile, the reaction force magnitude was almost constant at all roll
and pitch angles, showing that the proposed biaxial active nozzle is valid.
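The calibration mentioned above could, for instance, be implemented by inverting a measured command-to-force-angle table by interpolation. The table values below are made-up placeholders, not data from Fig. 2.10.

```python
import bisect

# Hypothetical calibration table (rad): commanded pitch angle vs. resulting
# force pitch angle, in the spirit of the relation plotted in Fig. 2.10b.
CAL_CMD = [1.31, 1.44, 1.57, 1.70, 1.83]    # commanded pitch angles
CAL_FORCE = [1.18, 1.33, 1.49, 1.66, 1.80]  # measured force pitch angles

def commanded_for(target_force_angle):
    """Invert the calibration curve by linear interpolation, mapping a
    desired force direction back to the pitch command that produces it."""
    i = bisect.bisect_left(CAL_FORCE, target_force_angle)
    i = min(max(i, 1), len(CAL_FORCE) - 1)  # clamp to a valid segment
    x0, x1 = CAL_FORCE[i - 1], CAL_FORCE[i]
    y0, y1 = CAL_CMD[i - 1], CAL_CMD[i]
    t = (target_force_angle - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)
```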
As for limitations, the nozzle had a large backlash. In the experiment, the maximum
error between the commanded pitch angle of the nozzle tube outlet and that estimated
from the photographs was approximately π/9 rad, because the nozzle mechanism,
fabricated with a three-dimensional printer, lacked sufficient accuracy and stiffness.
We will solve these problems in the future by choosing a sufficiently stiff material
(Fig. 2.12).
2.2.5 Demonstration
We combined the biaxial active nozzle with the ASC to evaluate its mobility. The
integrated ASC is a long and thin robot with an entire length of approximately 7 m
and an external diameter of approximately 50 mm. Its whole body is covered with
inclined cilia, and vibration motors are arranged in the body at regular intervals.
The robot moves forward through the repeated sticking and sliding of the cilia on
the ground when the whole body vibrates.
We evaluated how much the head-repositioning performance of the ASC improved
with the proposed biaxial active nozzle. We fixed a cylinder vertically and inserted
the integrated ASC into it from above so that the tip was 1 m from the cylinder.
From this condition, the robot lifted the tip by emitting an air jet, and an operator
arbitrarily controlled the air jet direction to look around the environment. The
maximum supply pressure was set to 0.6 MPa. We monitored the tip trajectory of
the robot with motion capture after the tip oscillation converged.
As a result, the horizontal distance of the tip from the center was approximately
530 mm (Fig. 2.13). The previous ASC, whose head was controlled by McKibben
actuators, achieved a horizontal distance of approximately 170 mm [20]; the range
over which the robot can look around was thus improved by roughly a factor of three.
Fig. 2.14 The integrated active scope camera explores the rubble. The robot can swing left and
right (a) and up and down (b) by changing the air jet direction
As another application of the designed biaxial active nozzle, we introduce
herein the "Dragon Firefighter," an aerial hose-type robot that can fly to the
fire source for extinguishing.
2.2.6.1 Concept
Figure 2.15 shows the conceptual diagram of the robot. The hose-type robot can fly
by expelling water jets, directly accessing the fire source on behalf of firefighters to
quickly and safely perform the fire extinguishing task. The features of this robot are
as follows: (1) It has an elongated shape suitable for entering indoor areas (a hose-type
robot has the advantage of easily accessing narrow and confined spaces). (2) Its nozzle
modules are distributed along the hose, enabling it to fly regardless of its length and
to control its shape. (3) The water jets from the nozzle modules generate enough force
to fly the robot, extinguish a fire, and cool the robot itself. (4) Each nozzle module
controls its reaction force by changing the water jet direction, and the water branches
off the main flow channel in the hose; the nozzle therefore requires no additional flow
channel, which is essential for a long robot.
In realizing this concept, the nozzle module design is a critical issue. The nozzle
module needs to control the magnitude and direction of the net force induced
by the water jets. However, regulating the flow rate to control the net force
is not feasible because flow regulators are too heavy to install on the robot.
Even if the flow rate were controlled at the root of the robot, many flow
channels would be needed, one per nozzle module, which is not feasible for a
long hose-type robot.
We propose a nozzle module consisting of multiple biaxial nozzles. Figure 2.16
shows a schematic of the proposed module. The magnitude and direction of the
net force are controlled by controlling the jet directions of the biaxial active nozzles.
Furthermore, in this nozzle module, the original flow path splits into two: part of the
flow is branched off to the module's own nozzles, and the rest is transmitted to the
nozzle modules farther along the hose. Hence, multiple modules can be combined in
a daisy chain.
The prototype robot shown in Fig. 2.17 was developed to demonstrate the feasibility
of the concept. The length of the robot was 2 m, and a nozzle module was located
at the head. The whole weight of the robot, including water, was approximately
2.5 kg. Figure 2.16b illustrates the developed nozzle module. The two biaxial
nozzles were controlled by four servo motors, and fixed nozzles were positioned to
gain the force to float. The root of the robot was connected to a water pump, which
delivered water to the nozzle module to emit the jets.
Fig. 2.16 Concept of the nozzle module and the developed nozzle module
Fig. 2.17 Developed prototype robot and its system. A nozzle module is located on the head of the
robot. The hose-type body is fixed at 2 m from the head at a 1 m height
For the controller, we used a very simple control law with the net force of the nozzle
module $\boldsymbol{f} \in \mathbb{R}^3$ as the control input:

$$ \boldsymbol{f} = \boldsymbol{F} - D_d \dot{\boldsymbol{r}}, \tag{2.3} $$
Fig. 2.18 Flying motion of the prototype. The head flies stably in the air, and the head direction
can be controlled as the control input
Fig. 2.19 Time responses of the control input, head position, and head posture. The position and
the pose are measured on the coordinate in Fig. 2.17b. The head position moves left and right (y
axis) as the control input changes. The yaw angle of the pose changes corresponding to the yaw
control input
where $\boldsymbol{F}$ is the commanded force and $D_d \dot{\boldsymbol{r}}$ is a damping term added to obtain better
stability. The stability of the controller has been discussed in [3]. The nozzle position
was estimated by multiple IMU sensors mounted at 400 mm intervals on the body of
the hose-type robot (Fig. 2.17a).
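The control law $\boldsymbol{f} = \boldsymbol{F} - D_d \dot{\boldsymbol{r}}$ of Eq. (2.3) can be sketched as below. A scalar damping gain (i.e., a diagonal $D_d$) and 3-tuple vectors are simplifying assumptions of ours.

```python
def control_force(F_cmd, r_dot, d_gain):
    """Net-force command for the nozzle module (Eq. 2.3): f = F - D_d * r_dot.

    F_cmd  -- commanded force vector (set by the operator), 3-tuple [N]
    r_dot  -- estimated head velocity, 3-tuple [m/s]
    d_gain -- scalar damping gain (a diagonal D_d is assumed for simplicity)
    """
    # The damping term opposes the head velocity, suppressing oscillation
    return tuple(Fi - d_gain * vi for Fi, vi in zip(F_cmd, r_dot))
```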
We conducted the experiment with the magnitude of $\boldsymbol{F}$ set at approximately 19 N.
The yaw direction of $\boldsymbol{F}$ was controlled with a joystick. The robot floated steadily
by spraying water, and its flying direction could be changed to the left or right by
changing the yaw angle of the force $\boldsymbol{F}$. Figures 2.18 and 2.19 show the flying
motion and the head movement, respectively. The robot can fly stably in the air, and
the head direction can be controlled by changing the net force direction. The top left
diagram in Fig. 2.19 displays the commanded yaw angle of the force over time, the
bottom left shows the tip position over time, and the right diagram shows the tip
posture over time.
2.2.7 Summary
This study proposed a nozzle that can change the direction of an air jet around two
axes. A flexible tube was attached to the tip of the rigid nozzle, and the air jet
direction was changed around the two axes by deforming and rotating this nozzle
tube. The biaxial active nozzle can rotate infinitely around the roll axis because the
bearing at the tip eliminates tube twist. A mechanical design that realizes the biaxial
deformation of the tube with a differential mechanism was also proposed.
Regarding the basic performance of the proposed nozzle, we confirmed that the
reaction force induced by the air jet can be redirected around the two axes. The
reaction force magnitude was almost the same regardless of the air jet direction,
indicating that the pressure drop caused by the tube deformation was not severe.
We mounted the biaxial active nozzle on the head of the ASC. The range over which
the robot can look around in a vertical exploration became three times larger than
that of the previous ASC, whose head was controlled by McKibben actuators. We
also confirmed that the robot was able to explore rubble by floating and steering the
head with the nozzle.
As another application, we mounted the biaxial active nozzle on the aerial hose-type
robot Dragon Firefighter. The biaxial active nozzle contributed to the design of the
distributed nozzle modules, and the flight demonstration showed that the robot can
fly in the air and steer its course using the nozzles.
Audio information is one of the most important clues for a human rescue team
finding victims trapped in a collapsed building. Even a victim who is behind rubble
and cannot be seen by the members of a rescue team can be found if his/her speech
sounds reach beyond the rubble. Rescue activities with an active scope camera (ASC)
will thus be enhanced by developing and implementing auditory functions that help
the operator of the robot find victims in complicated rubble environments.
In the ImPACT-TRC project, we are developing a speech enhancement system
for an ASC [5, 6, 8, 9]. The ego noise of the vibration motors and air-jet nozzle
on the robot is much louder than the speech of a victim far from the robot.
A simple approach is to frequently stop the actuators and listen for speech, but this
is unacceptable because it prevents the robot operator from searching a wide area
quickly. To search for victims under loud ego noise, speech enhancement, which
suppresses noise and extracts the speech contained in a noisy raw recording, is
crucial.
We have also been developing sound-based posture (shape) estimation that localizes
the microphones and loudspeakers mounted on an ASC [7]. Since the long body of an
ASC is flexible so that it can penetrate narrow gaps, it is difficult for the operator
to navigate the robot as desired. To quickly maneuver the robot to a target
area, estimating the posture of the unseen ASC is crucial. The posture can be
estimated by localizing the multiple sensors distributed on the robot. Acoustic sensors
can be localized by emitting reference sounds from the loudspeakers and measuring
the time differences of arrival (TDOAs) of a reference sound, which depend on the
sensor locations [45, 53]. The audio information is thus helpful not only for the
victim search but also for the navigation of a rescue robot.
Sound-based posture estimation is complementary to conventional methods
based on magnetometers [40, 62] or inertial sensors [30]. Magnetometers cannot be
used under rubble or in collapsed buildings because the magnetic field is disturbed
by the rubble. Inertial sensors, which consist of accelerometers and gyroscopes,
gradually accumulate estimation errors because they cannot observe their current
locations. On the other hand, the sound-based method can localize acoustic sensors
even in a closed space if the reference sound propagates directly from a loudspeaker
to the microphones, and it can infer information about the current locations from
the TDOAs.
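As a toy illustration of TDOA measurement (ours, not the authors' implementation), the arrival lag of a known reference sound at each microphone can be found by peak-picking a cross-correlation; the difference of two lags is the TDOA in samples.

```python
def tdoa_samples(ref, mic_a, mic_b):
    """Estimate the TDOA (in samples) of a known reference sound between two
    microphone recordings via naive cross-correlation peak-picking.
    A practical system would use GCC-PHAT on FFTs for speed and robustness."""
    def peak_lag(sig):
        # Slide the reference over the recording and keep the best-matching lag
        best_lag, best_val = 0, float("-inf")
        for lag in range(len(sig) - len(ref) + 1):
            v = sum(r * s for r, s in zip(ref, sig[lag:lag + len(ref)]))
            if v > best_val:
                best_lag, best_val = lag, v
        return best_lag
    return peak_lag(mic_b) - peak_lag(mic_a)
```

Dividing the sample lag by the sampling rate and multiplying by the speed of sound converts the TDOA into a path-length difference, which constrains the sensor locations.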
The speech enhancement and posture estimation for an ASC are essential not
only for enhancing the operator’s ability to use the robot but also for developing
an intelligent system for the robot. For example, a victim calling for help could be
located by integrating the speech power at microphones and the robot posture, which
are estimated with speech enhancement and posture estimation. Such information
will enable the robot to automatically search for and reach victims trapped under a
collapsed building.
Speech enhancement for an ASC has to deal with noise that depends on the
surrounding environment, because the vibration noise of the robot includes sounds
caused by contact with the ground. In other words, supervised speech enhancement
based on noise samples gathered in advance is difficult to use. This calls for blind
speech enhancement, which extracts speech signals based on statistical assumptions
instead of pre-training data. Speech enhancement for an ASC also has to deal with
the following two technical problems:
Deformable configuration of microphones: The relative locations of the microphones
change over time as the robot moves.
Partial occlusion of microphones: Some of the microphones on the robot are often
covered or occluded by rubble around the robot. Such occluded microphones fail to
capture speech sounds and degrade the enhancement.
These problems make it difficult to exploit existing speech enhancement methods [27,
36, 51].
We have been developing two kinds of blind speech enhancement based on low-rank
and sparse decomposition [5, 8]. Since the noise spectrograms of an ASC have
periodic structures due to the vibration, they can be regarded as low-rank spectrograms.
Speech spectrograms, on the other hand, have sparse structures and change over
time (they are not low-rank). Based on this statistical difference, we can separate
speech and noise signals without pre-training (Fig. 2.20). Since this decomposition is
based on the characteristics of amplitude spectrograms, it is robust against microphone
movements, which mainly affect the phase terms of the spectrograms.
RPCA decomposes the observed spectrogram into low-rank and sparse components
by solving

$$ \min_{\mathbf{L},\mathbf{S}}\ \frac{1}{2}\|\mathbf{X} - \mathbf{L} - \mathbf{S}\|_F^2 + \lambda_1\|\mathbf{L}\|_* + \lambda_2\|\mathbf{S}\|_1 \tag{2.4} $$

where $\|\cdot\|_*$ and $\|\cdot\|_1$ represent the nuclear and L1 norms, respectively. In the speech
enhancement scenario, $\mathbf{X}, \mathbf{L}, \mathbf{S} \in \mathbb{R}^{F \times T}$ respectively correspond to the input, noise,
and speech amplitude spectrograms, where $F$ and $T$ are the numbers of frequency
bins and time frames. Since the estimated components can take negative values,
which are not allowed for amplitude spectrograms, a hinge function $f(x) := \max(0, x)$
is applied to the estimated results so that the components take only nonnegative values.
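To make the decomposition idea concrete, here is a toy batch sketch of ours, not the chapter's algorithm: the periodic noise is approximated by a rank-1 fit, the sparse speech by L1 soft-thresholding, and the hinge $\max(0, \cdot)$ is applied at the end. The real ORPCA described next is online and uses a higher-rank factorization.

```python
def soft(x, t):
    """Soft-thresholding: the proximal operator of the L1 norm."""
    return max(0.0, x - t) - max(0.0, -x - t)

def rank1(X, iters=50):
    """Best rank-1 approximation of a list-of-lists matrix by power iteration."""
    F, T = len(X), len(X[0])
    h = [1.0] * T
    for _ in range(iters):
        w = [sum(X[f][t] * h[t] for t in range(T)) for f in range(F)]
        n = sum(v * v for v in w) ** 0.5 or 1.0
        w = [v / n for v in w]
        h = [sum(X[f][t] * w[f] for f in range(F)) for t in range(T)]
    return [[w[f] * h[t] for t in range(T)] for f in range(F)]

def decompose(X, lam=0.5, iters=20):
    """Alternate a low-rank fit (periodic noise) with soft-thresholding
    (sparse speech); the hinge keeps speech amplitudes nonnegative."""
    F, T = len(X), len(X[0])
    S = [[0.0] * T for _ in range(F)]
    for _ in range(iters):
        L = rank1([[X[f][t] - S[f][t] for t in range(T)] for f in range(F)])
        S = [[soft(X[f][t] - L[f][t], lam) for t in range(T)] for f in range(F)]
    speech = [[max(0.0, S[f][t]) for t in range(T)] for f in range(F)]
    return L, speech
```

On a spectrogram consisting of a rank-1 noise floor plus one isolated spike, the sketch assigns the floor to the low-rank part and the spike to the sparse part.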
We used an online extension of RPCA to enhance speech in real time. Online
RPCA (ORPCA), proposed by Feng et al. [17], factorizes the low-rank component as
$\mathbf{L} = \mathbf{W}\mathbf{H}^{\top}$ and solves the following minimization problem, whose cost function is
an upper bound of that of Eq. (2.4):

$$ \min_{\mathbf{W},\mathbf{H},\mathbf{S}}\ \frac{1}{2}\bigl\|\mathbf{X} - \mathbf{W}\mathbf{H}^{\top} - \mathbf{S}\bigr\|_F^2 + \frac{\lambda_1}{2}\bigl(\|\mathbf{W}\|_F^2 + \|\mathbf{H}\|_F^2\bigr) + \lambda_2\|\mathbf{S}\|_1 \tag{2.5} $$
The optimization problem of ORPCA (Eq. (2.5)) can be interpreted as the maximum
a posteriori (MAP) estimation of the following probabilistic model:

$$ x_{ft} \sim \mathcal{N}\Bigl(\sum_k w_{fk} h_{kt} + s_{ft},\ 1\Bigr) \tag{2.7} $$

$$ w_{fk} \sim \mathcal{N}(0, 2\lambda_1) \qquad h_{kt} \sim \mathcal{N}(0, 2\lambda_1) \qquad s_{ft} \sim \mathcal{L}\bigl(0, \lambda_2^{-1}\bigr) \tag{2.8} $$

where $\mathcal{N}(\mu, \lambda) \propto \exp\bigl(-\tfrac{1}{2}\lambda(x-\mu)^2\bigr)$ represents the Gaussian distribution and
$\mathcal{L}(\mu, b) \propto \exp\bigl(-b^{-1}|x-\mu|\bigr)$ is the Laplace distribution. As mentioned above, the
ORPCA model can take negative values for amplitude spectrograms. This makes it
difficult to formulate low-rank and sparse decomposition for a multichannel audio
input as a unified model. A nonnegative version of RPCA, called robust nonnegative
matrix factorization (RNMF), has recently been proposed for audio source separation
[18, 41, 69]. We reformulated RNMF as a Bayesian probabilistic model for
further extensions and developed RNTF, an extension for multichannel audio
inputs.
As illustrated in Fig. 2.22, RNTF decomposes an input M-channel amplitude
spectrogram x_{mft} ∈ ℝ_+ into channel-wise bases and activations (w_{mfk}, h_{mkt} ∈ ℝ_+),
a speech spectrogram s_{ft} ∈ ℝ_+, and its gain ratio at each microphone g_{mt} ∈ ℝ_+:
where P(λ) ∝ λ^x exp(−λ) is the Poisson distribution. Note that the Poisson distribution
is put on a continuous random variable x_{mft}, so this continuous likelihood
is an improper distribution. It is nevertheless widely used in audio source separation
because its maximum likelihood estimation corresponds to minimizing the
Kullback–Leibler divergence [28].
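The link between the Poisson likelihood and the KL divergence can be illustrated with the classic multiplicative-update NMF, whose updates perform maximum-likelihood estimation under this model. This is a generic sketch of that connection, not the RNTF inference itself:

```python
import numpy as np

def kl_divergence(X, V):
    """Generalized KL divergence D(X || V); its minimizer is the Poisson ML fit."""
    return np.sum(X * np.log((X + 1e-12) / (V + 1e-12)) - X + V)

def kl_nmf(X, K, n_iter=200, seed=0):
    """NMF with multiplicative updates minimizing the generalized KL divergence."""
    rng = np.random.default_rng(seed)
    F, T = X.shape
    W = rng.random((F, K)) + 0.1
    H = rng.random((K, T)) + 0.1
    for _ in range(n_iter):
        V = W @ H
        W *= ((X / V) @ H.T) / H.sum(axis=1)        # update bases
        V = W @ H
        H *= (W.T @ (X / V)) / W.sum(axis=0)[:, None]  # update activations
    return W, H
```

Each update monotonically decreases the KL divergence, i.e., monotonically increases the Poisson likelihood of the model W @ H.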
The speech and noise spectrograms are characterized by putting specific prior
distributions on the latent variables. The latent variables of noise, w_{mfk} and h_{mkt},
follow a gamma distribution, which is a distribution over nonnegative real values and
the conjugate prior of the Poisson distribution:

w_{mfk} ∼ G(a^w, b^w), h_{mkt} ∼ G(a^h, b^h) (2.11)

g_{mt} ∼ G(a^g, a^g) (2.12)

where a^g represents a hyperparameter that controls the variance of the gain parameter.
The speech spectrogram s_{ft} follows a prior distribution that consists of the following
gamma and Jeffreys priors:

s_{ft} ∼ G(a^s, β_{ft}), p(β_{ft}) ∝ β_{ft}^{−1} (2.13)

The Jeffreys prior, one of the non-informative priors, is put on each time-frequency
(TF) bin of the scale parameter β_{ft}. Estimating β_{ft} enables the significance of each
TF bin to be determined automatically, as in Bayesian RPCA [4], which leads to the
speech spectrogram s_{ft} being sparse.
The decomposition of an input spectrogram is conducted by estimating the posterior
distribution of the RNTF model, p(W, H, G, S, β|X). Since it is hard to derive
this posterior distribution analytically, it is approximated by variational Bayesian
inference [10]. As reported in [5], our method kept its robustness even when half of
the microphones were shaded by an obstacle. RNTF also worked in a highly reverberant
environment where the reverberation time RT60 was 800 ms, because the late
reverberation can be separated into the low-rank components.
Since the RNTF was an offline algorithm, we extended it as a state-space model
called streaming RNTF [5] that can enhance speech in a mini-batch manner. The
experimental results showed that the enhancement performance of the streaming
RNTF was similar to that of the offline RNTF when the mini-batch size was more
than 2.0 s (200 frames). The streaming RNTF was implemented on the NVIDIA
Jetson TX1 (Fig. 2.23), a mobile GPGPU board, and enhanced speech in real time.
2 ImPACT-TRC Thin Serpentine Robot Platform for Urban Search and Rescue 49
Fig. 2.23 NVIDIA Jetson TX1 used for real-time speech enhancement
In this project, we have been developing blind speech enhancement based on low-rank
and sparse decomposition, focusing on the vibration noise of the ASC. We are
currently developing speech enhancement for air-jet noise, which has higher energy
than the vibration noise and is sometimes non-stationary.
One good characteristic of this noise is that it may be independent of the
environment, because its dominant component is caused by the air jet itself.
This enables us to use (semi-)supervised enhancement that is pre-trained with
noise sounds. RNTF can be extended to a semi-supervised enhancement algorithm
by modifying the prior distribution of the basis vectors (Eq. (2.11)). Such an extension
would be able to suppress even unknown noise, because the noise bases are not fixed
by the prior distribution and can be adapted to the observation.
The main challenge of posture estimation for an ASC is to estimate the posture
robustly even in rubble-containing environments. Reliable audio measurements can
be obtained only when the reference sound emitted from a loudspeaker propagates
to the microphone directly or almost directly. It is crucial to use different kinds
of sensors that have different characteristics and to determine whether the values
observed with each are reliable.
50 M. Konyo et al.
We use the TDOAs of a reference sound to estimate the posture of an ASC. When
the sth loudspeaker plays back a reference sound for the kth measurement, the TDOA
between microphones m1 and m2 (the onset time difference of the reference sound
between microphones m1 and m2) is denoted by τ_{m1→m2,s,k} and is defined as follows:

τ_{m1→m2,s,k} = t_{m2,s,k} − t_{m1,s,k} (2.14)

where t_{m,s,k} represents the onset time of the reference sound. The TDOA estimation
has to be robust against the following three characteristics [7]:
Ego noise and external noise Both the ego noise (e.g., vibration noise) and external
noise (e.g., engine sounds from other machines) contaminate the recordings of
reference sounds. We have to distinguish the reference sound from the noise
sounds.
Reverberations and reflections In the closed space where an ASC robot is used, the
reverberations and reflections of a reference sound also contaminate the recorded
signal. We have to detect the onset time of the reference sound that arrived directly
rather than the onset times of the reference sound’s reverberations or reflections.
Partial occlusion of microphones When there are obstacles between microphones
and a loudspeaker, the TDOAs are different from those in an open space. Since it
is difficult to formulate the propagation path of a diffracted sound, such a sound
should be distinguished from the direct sound.
Solution to ego noise and external noise
For robustness against noise sounds, we use a time-stretched pulse (TSP) [60] as
a reference sound. The TSP signal is defined in the frequency domain as follows:

TSP(ω) = e^{j4παω²} (2.15)

where ω denotes the frequency bin index, and α = 1, 2, 3, … controls the duration
of the reference sound. As shown in Fig. 2.24, the TSP signal is a sine wave
swept from the Nyquist frequency down to 0 Hz. Since this signal contains all frequency
components, we can obtain a sharp peak corresponding to the onset time of the
reference sound by convolving the reference signal with the noisy recording. By detecting
this sharp peak, the onset time detection is robust against noise sounds
and retains high time resolution.
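The TSP reference signal and the matched-filter onset detection can be sketched as follows. The discrete phase-only form, the signal lengths, and the sweep parameter here are common illustrative choices, not the authors' exact settings:

```python
import numpy as np

def tsp_signal(N, m=2):
    """Time-stretched pulse: phase-only spectrum exp(-j*4*pi*m*k^2/N^2)."""
    k = np.arange(N // 2 + 1)
    H = np.exp(-1j * 4.0 * np.pi * m * k**2 / N**2)
    h = np.fft.irfft(H, N)          # real time-domain swept sine
    return h / np.linalg.norm(h)

def detect_onset(recording, reference):
    """Matched filter: cross-correlate and pick the sharpest peak."""
    corr = np.correlate(recording, reference, mode="valid")
    return int(np.argmax(np.abs(corr)))

# embed the reference at a known onset in noise and recover it
rng = np.random.default_rng(1)
ref = tsp_signal(512)
rec = 0.05 * rng.standard_normal(2048)
rec[300:300 + len(ref)] += ref
onset = detect_onset(rec, ref)      # recovers the embedded onset
```

Because the TSP spectrum has unit magnitude at every frequency, its autocorrelation is impulse-like, which is exactly why the correlation peak is sharp.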
Solution to reverberations and reflections
We use the generalized cross-correlation with phase transform (GCC-PHAT) [38,
68], which is known as an onset time detection method robust against reverberation.
Let x(ω) and y(ω) be the reference signal and its recording in the frequency domain,
respectively. The GCC-PHAT outputs the likelihood of the onset time of x as follows:
2 ImPACT-TRC Thin Serpentine Robot Platform for Urban Search and Rescue 51
Fig. 2.24 The time domain and time-frequency domain representations of a TSP signal
GCC-PHAT(t|x, y) = iFFT_t [ y(ω)x(ω)* / |y(ω)x(ω)*| ] (2.16)

where iFFT_t[f(ω)] represents the result of the inverse Fourier transform of a spectrum
f(ω) at time t.
The reflections are tackled by picking the first peak of the GCC-PHAT coefficient
(Eq. (2.16)). The sound that arrives directly is measured earlier than its reflections
because the propagation path of a reflection is longer than that of the direct sound.
The direct sound can therefore be distinguished from its reflections.
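Equation (2.16) translates directly into code; the whitening by |y(ω)x(ω)*| is what gives the robustness to reverberation. This is an illustrative sample-level implementation, not the authors' code:

```python
import numpy as np

def gcc_phat_lag(x, y):
    """Estimate the lag (in samples) of y relative to x via GCC-PHAT."""
    n = len(x)
    X = np.fft.rfft(x)
    Y = np.fft.rfft(y)
    R = Y * np.conj(X)
    R /= np.abs(R) + 1e-12           # phase transform: keep only the phase
    cc = np.fft.irfft(R, n)          # peak location encodes the lag
    lag = int(np.argmax(cc))
    return lag - n if lag > n // 2 else lag  # unwrap negative lags
```

Dividing by the sampling frequency converts the returned lag to a TDOA in seconds.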
The occlusion of microphones is dealt with by outlier detection. The TDOA
|τ_{m1→m2,s,k}| is less than or equal to the time of flight (ToF) over the body length
between the corresponding microphones on an ASC robot. The TDOA of a sound
diffracted by obstacles, on the other hand, can be larger than this ToF. We exclude
such outliers and estimate the posture with the reliable TDOA measurements.
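This ToF-based gating reduces to a one-line check; the speed of sound and the separation value below are illustrative:

```python
C = 343.0  # speed of sound in air [m/s]

def reliable_tdoas(tdoas, mic_separation):
    """Keep only TDOAs physically consistent with direct propagation:
    |tau| must not exceed the time of flight over the microphone separation."""
    tof = mic_separation / C
    return [tau for tau in tdoas if abs(tau) <= tof]
```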
where R_z represents the covariance matrix of the random walk. The prior r(z_k), on
the other hand, is formulated as a Gaussian distribution as follows:

r(z_k) = N(μ_0, R_0) (2.20)
where μ_0 and R_0 are model parameters that represent a feasible posture. We set
these values so that the angles θ_{i,k} and φ_{i,k} tend to 0 and l_{i,k} tends to the robot
length between the modules on the robot.
The measurement models for TDOAs and tilt angles are separately formulated
as follows. The measurement model for TDOAs is formulated with the distance
difference between a loudspeaker and each of two microphones:
τ_{m1→m2,s,k} ∼ N( (|x_{m2,k} − x_{s,k}| − |x_{m1,k} − x_{s,k}|) / C, σ_τ² ) (2.21)

where x_{m,k} and x_{s,k} are the locations of the mth microphone and the sth loudspeaker,
C is the speed of sound in air, and σ_τ² represents the model parameter controlling
the variance of τ_{m1→m2,s,k}. The tilt angle a_{m,k} obtained by the mth accelerometer
is formulated as follows:
Fig. 2.26 Graphical representation of the state-space model for posture estimation
Fig. 2.27 Experimental conditions for evaluating microphone-accelerometer based posture esti-
mation [7]. Red lines indicate the ground truth posture obtained by using a motion capture system
Fig. 2.28 Results of posture estimation [7]. The gray and black lines represent the ground-truth
postures and initial-states of the postures, respectively
a_{m,k} ∼ N( (1/2) ∑_{i=1}^{2m−2} φ_{i,k} + (1/2) ∑_{i=1}^{2m−1} φ_{i,k}, σ_a² ) (2.22)

where σ_a² represents the model parameter controlling the variance of a_{m,k}.
The robot posture z_k is estimated from the TDOAs τ_{1:k} and tilt angles a_{1:k} by
inferring the posterior distribution of our state-space model (Fig. 2.26). Since the
state-space model is non-linear in Eq. (2.21), we approximate the posterior
distribution of the posture p(z_k|τ_{1:k}, a_{1:k}) as a Gaussian distribution by using
the unscented transform [33, 63]. As reported in [7], we evaluated our method in
three environments by using a 2.8-m ASC (Fig. 2.27). In Fig. 2.28, the microphone-accelerometer-based
posture estimation (proposed) is compared with a baseline
method that uses only acoustic sensors. The baseline method failed to estimate postures
in the second and third conditions, where there were some obstacles. The
microphone-accelerometer-based estimation, on the other hand, robustly estimated
the robot posture in all conditions. This posture estimation worked in real time
on a standard laptop computer with an Intel Core i7-3517U processor (2 cores,
1.9 GHz).
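The unscented transform used to handle the nonlinearity of Eq. (2.21) can be sketched as follows: sigma points of the state (here reduced, for illustration, to a single microphone position with the other geometry fixed) are pushed through the distance-difference measurement function, and the transformed mean and covariance are recovered. This is a generic sketch, not the full posture filter:

```python
import numpy as np

C = 343.0  # speed of sound [m/s]

def sigma_points(mu, P, kappa=1.0):
    """2n+1 symmetric sigma points and their weights."""
    n = len(mu)
    L = np.linalg.cholesky((n + kappa) * P)
    pts = np.vstack([mu, mu + L.T, mu - L.T])   # rows of L.T are columns of L
    w = np.full(2 * n + 1, 1.0 / (2 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    return pts, w

def unscented_transform(f, mu, P, kappa=1.0):
    """Propagate a Gaussian (mu, P) through a nonlinear function f."""
    pts, w = sigma_points(mu, P, kappa)
    ys = np.array([f(p) for p in pts])
    mean = w @ ys
    diff = ys - mean
    cov = (w[:, None] * diff).T @ diff
    return mean, cov

def tdoa_mean(mic2, mic1=np.zeros(3), spk=np.array([1.0, 0.0, 0.0])):
    """Mean of Eq. (2.21): distance difference over the speed of sound."""
    return np.array([(np.linalg.norm(mic2 - spk) - np.linalg.norm(mic1 - spk)) / C])
```

In the unscented Kalman filter, the same transform is applied to the full posture state, and the resulting mean and covariance drive the Gaussian posterior update.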
ASCs enable us to explore the inside of a confined environment safely from the
outside. Operators can visually inspect the inside using video captured by a camera
built into the front end. However, it is not easy to grasp the global 3D structure of
the environment by watching the video: an image sequence captured over a short
period covers only a small part of the environment. At best, the operator can form
a rough mental picture of the structure, which requires watching the video continuously
from the beginning of the operation, and such a picture is also hard to share with others.
These issues can be resolved by visualizing the internal 3D structure of
the environment. Toward this end, we consider employing visual SLAM
(simultaneous localization and mapping), a method for estimating the structure of
a scene as well as the camera motion from video captured by a camera moving in the
scene. Visual SLAM has a long history of research, and several practical methods
have been developed, such as feature-based methods (e.g., PTAM [37], ORB-SLAM
[46, 47]) and direct methods (e.g., DTAM [50], LSD-SLAM [16], DSO [15]). Considering
the payload limitations of an ASC, visual SLAM, which needs only a single camera,
is one of the few practical solutions.
ASCs are typically operated in a closed and confined space. This often makes the
distance from the camera to environmental surfaces shorter than in an open, non-
confined space. As a result, the captured video images tend to change rapidly and
drastically between consecutive time frames in the video. Even if the camera moves
at the same speed, a closer distance to environmental surfaces brings about larger
temporal image changes. Moreover, the head of an ASC often undergoes abrupt
motion, for example, when it moves along a planar surface and suddenly falls off it,
or when the ASC lifts off the ground with air propulsion and then hits a wall or
ceiling.
These rapid, large image changes pose a serious difficulty for visual SLAM.
The processing pipeline of visual SLAM starts with matching features (e.g., points)
across different image frames in the video, and its success depends first on this initial
step. Slow and continuous image changes enable steady tracking of the same
landmarks across consecutive images; rapid and large image changes, on the other
hand, make this step fundamentally difficult, resulting in failure of the entire
pipeline.
Other difficulties for visual SLAM are motion blur and image distortion caused
by a rolling shutter (RS). Although these are common in other applications, they
are more critical for ASCs. As explained above, ASCs often come close to
environmental surfaces, so image motion tends to be fast relative to the camera's
translational velocity. Moreover, ASCs are often used in low-light environments,
where a fast shutter speed may not be available, making motion blur more likely
to occur. RS distortion emerges when a camera with a rolling shutter undergoes
rapid motion during frame capture; its effect tends to be particularly large for ASCs
for a similar reason. Although good-quality global-shutter cameras have recently
become available, rolling-shutter cameras are superior in terms of image quality in
low-light environments due to their physical design. Thus, when we employ
rolling-shutter cameras, it is important to deal properly with RS distortion. In what
follows, we first consider RS distortion and then the difficulty of tracking features
across video frames caused by rapid, large image changes.
Model of RS Distortion
Let X ≡ [X, Y, Z]^T and x ≡ [x, y, z]^T denote the world and camera coordinates,
respectively. The coordinate transformation between these two is given by
where c and r are the column (x) and row (y) coordinates, respectively; the shutter
closes in ascending order of r; R(rφ) and v represent the rotation matrix and
translation vector of the camera motion; and φ = [φ_1, φ_2, φ_3]^T is the axis-angle
representation of the rotation. Note that R and p are the camera pose when the
shutter closes at r = 0. Assuming the angle |φ| to be small, we approximate (2.24)
as follows:

[c, r, 1]^T ∝ (I + r[φ]_×) R {X − (p + r v)}, (2.25)

where

        ⎡  0   −φ_3   φ_2 ⎤
[φ]_× = ⎢  φ_3   0   −φ_1 ⎥ . (2.26)
        ⎣ −φ_2  φ_1    0  ⎦
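The small-angle linearization used in Eq. (2.25) can be checked numerically: for small φ, I + [φ]_× agrees with the exact rotation up to second order. A quick verification sketch (Rodrigues' formula gives the exact rotation):

```python
import numpy as np

def skew(phi):
    """[phi]_x as in Eq. (2.26)."""
    p1, p2, p3 = phi
    return np.array([[0.0, -p3, p2],
                     [p3, 0.0, -p1],
                     [-p2, p1, 0.0]])

def rodrigues(phi):
    """Exact rotation matrix for an axis-angle vector phi."""
    theta = np.linalg.norm(phi)
    if theta < 1e-12:
        return np.eye(3)
    K = skew(phi / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)
```

For |φ| on the order of 0.01 rad, the difference between `rodrigues(phi)` and `np.eye(3) + skew(phi)` is on the order of |φ|², which justifies the approximation within a frame.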
where f_d : [x′, y′]^T → [x″, y″]^T is defined as

x″ = x′ − φ_3 y′², (2.29a)
y″ = y′ + φ_3 x′ y′, (2.29b)

and f_p : [x″, y″]^T → [c, r]^T as

              ⎡ 1   φ_2    0 ⎤ ⎡ x″ ⎤
[c, r, 1]^T ∝ ⎢ 0  1 − φ_1  0 ⎥ ⎢ y″ ⎥ . (2.30)
              ⎣ 0    0     1 ⎦ ⎣ 1  ⎦
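The two mappings of Eqs. (2.29)–(2.30) translate directly into code. The variable names are illustrative, with primes written as suffixes (`xp` for x′, `xpp` for x″):

```python
import numpy as np

def f_d(xp, yp, phi3):
    """Distortion term of Eq. (2.29): acts like a varying lens distortion."""
    return xp - phi3 * yp**2, yp + phi3 * xp * yp

def f_p(xpp, ypp, phi1, phi2):
    """Projective term of Eq. (2.30): skew phi2 and aspect ratio 1 - phi1."""
    A = np.array([[1.0, phi2, 0.0],
                  [0.0, 1.0 - phi1, 0.0],
                  [0.0, 0.0, 1.0]])
    c, r, w = A @ np.array([xpp, ypp, 1.0])
    return c / w, r / w
```

Setting φ_1 = φ_2 = φ_3 = 0 reduces both mappings to the identity, i.e., the ordinary global-shutter projection.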
Assume the RS camera model given by Eqs. (2.28)–(2.30) with unknown motion
parameters for each image. Then, the problem of reconstructing the pose and RS
parameters of the camera at each viewpoint, as well as the scene points, is equivalent
to self-calibration of an imaginary camera that has unknown, varying skew φ_2 and
aspect ratio 1 − φ_1, along with the varying lens distortion given by (2.29).
Self-calibration and Critical Motion Sequences
the angular velocity |φ| was set so that (r_max − r_min)|φ| equals a random variable
generated from a Gaussian distribution N(0, σ_rot²), where r_max and r_min are the top and
bottom rows of the images. For the translation, each of its three elements was generated
according to N(0, (σ_trans t̄)²), where t̄ is the average distance between consecutive
camera positions in the sequence. We set σ_rot = 0.05 rad and σ_trans = 0.05. We
used the same internal camera parameters as the original reconstruction. We added
Gaussian noise ε ∼ N(0, 0.5²) to the x and y coordinates of each image point.
We applied four methods to the data thus generated, running each method for 100
trials on each image sequence. In each trial, we regenerated the additive image noise
and the initial values for bundle adjustment. The RS distortion for each image (except
the first image) of each sequence was randomly generated once and fixed throughout
the trials; we intentionally gave no distortion to the first image. The first compared
method is ordinary bundle adjustment without any RS camera model (referred to as
"w/o RS"). The second and third are bundle adjustment incorporating the proposed
RS camera model: the second optimizes all the RS parameters equally in bundle
adjustment ("w/ RS"), and the third sets φ11 = 0 and optimizes all the others
("w/ RS*"), implementing the proposed approach for resolving the CMS mentioned
earlier. The last is BA incorporating the exact RS model with linearized rotation
(2.27) ("w/ RS(r[φ]×)").
Figure 2.29 shows the results, i.e., cumulative error histograms of the estimated
camera translations and of the structure (scene points). To eliminate the scaling
ambiguity in evaluating translation and structure, we apply a similarity transformation
so that the camera trajectory is maximally close to the true one in the least-squares
sense. The translation error is then measured by the average difference between the
true camera position p_i and the estimated one p̂_i over the viewpoints. The structure
error is measured by the sum of distances between the true points and their estimated
counterparts. The method "w/ RS*", which fixes φ11, shows the best performance,
confirming the effectiveness of our approach. We also show typical reconstruction
results in Fig. 2.30: the method "w/ RS*" yields a more accurate camera path and
point cloud than the others, which agrees with the above observations.
Fig. 2.29 Results. Cumulative histogram of errors of estimated translation components of camera
poses and of estimated 3D points. "w/o RS" is ordinary BA without the proposed RS camera model.
"w/ RS" and "w/ RS*" are BA incorporating the proposed RS model; φ11 is fixed to 0 for the latter.
"w/ RS(r[φ]×)" is BA incorporating the original RS model (2.27)
Fig. 2.30 Typical reconstruction results for sequence fountain-P11. a w/o RS. b w/ RS. c w/
RS* (φ11 fixed). Grey dots and lines with crosses are true scene points and true camera positions,
respectively
A simple solution to the aforementioned difficulty with rapid, large image changes
is to use fast frame-rate cameras. A growing number of cameras on the market can
capture images at 100 to 300 frames per second and meet the other requirements:
high sensitivity to low light, which is essential for the shorter exposure times that
faster frame rates demand, and a small body, which is necessary to build the camera
into the small head of an ASC. Such cameras provide images that change only slowly
between consecutive time frames, which helps mitigate the issue of tracking features
in video images.
The actual bottleneck is rather the processing speed of visual SLAM, which is
10 to 50 frames per second for recent visual SLAM systems, assuming a high-end
desktop PC with a multi-core CPU, with or without a GPU. It is therefore impossible
to process all the images obtained at a frame rate beyond this processing speed.
Thus, we relax the requirement of "real-time" processing. The purpose of using
visual SLAM for an ASC is to provide the environmental map and camera poses
for better maneuvering of the ASC in confined environments. Since we do not
consider more time-critical applications such as motion control, a certain amount
of time delay is allowed (but it should remain within a range that keeps the above
application possible).
efficient for cases where a bad imaging condition persists for a certain period of
time and thus every frame needs to be processed. We choose to prioritize the above
case, where the bad condition continues only for a short period of time.
Our method for selecting images can be integrated with any visual SLAM system.
In what follows, we explain its integration with ORB-SLAM2 [47]. We implement
the integrated system on a single PC with a multi-core CPU, where multithreading is
maximally utilized for performance. We use the first thread for capturing
high-frame-rate images from the ASC camera, which is connected to the PC by a
USB3 cable. Each captured image is stored in a ring buffer after simple preprocessing
(e.g., resizing and contrast adjustment). In the same thread, we also select images
for display and send them over a ROS network to another PC offering a UI for the
entire ASC system; these images are selected uniformly at 30 frames per second.
We use the second thread to retrieve an image from the ring buffer and input it to the
visual SLAM pipeline. The image is received by ORB-SLAM, which runs in multiple
other threads and computes the current camera pose and 3D point cloud; these are
sent over the ROS network to the UI PC.
Our code running in the second thread monitors the success or failure of the visual
SLAM pipeline. The decision is made by checking whether the number of tracked
feature points in the frame is more than 30. If it detects a failure, we discard the
information about the last frame, go back to the last successful frame, and input the
next frame stored in the ring buffer to ORB-SLAM. If this does not recover from the
failure, we make the visual SLAM system restart the processing pipeline; in the case
of ORB-SLAM, we invoke the "re-localization mode" to try to localize the camera
in the map constructed so far. If this re-localization succeeds, we can restart the
visual SLAM pipeline from the re-localized camera pose.
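The monitoring logic above can be sketched as follows. The threshold of 30 tracked points follows the text; `track` is a stand-in for the ORB-SLAM front end, and the re-localization step is reduced to a counter in this simplified sketch:

```python
from collections import deque

TRACK_OK = 30  # minimum number of tracked feature points for success

def run_pipeline(frames, track):
    """Feed buffered high-frame-rate frames to a SLAM front end,
    skipping frames whose tracking fails; count unrecoverable failures
    that would trigger ORB-SLAM's re-localization mode."""
    ring = deque(frames)
    processed = []
    relocalizations = 0
    while ring:
        frame = ring.popleft()
        if track(frame) >= TRACK_OK:
            processed.append(frame)   # tracking succeeded
        elif ring:
            continue                  # drop the bad frame, try the next buffered one
        else:
            relocalizations += 1      # no frames left to retry: re-localize
    return processed, relocalizations
```

The key design choice is that the high frame rate makes individual frames cheap to discard: skipping a badly blurred frame usually lands on a nearby frame that still overlaps the last successful one.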
Fig. 2.31 Example image frames contained in the video used for evaluation of the frame selection
algorithm
Fig. 2.32 Histogram of the frame index (×10³) reached in each trial
due to the proposed failure recovery mechanism. The baseline system successfully
processed all the image frames, and thus arrived at the end of the image sequence,
in 30 out of 100 trials, whereas the proposed system did so in about 60 trials. We can
conclude from these results that the proposed method is effective in improving the
robustness of visual SLAM.
We tested the proposed method in a field test evaluating the comprehensive
performance of ASCs. The tested ASC system is equipped with a Sentech
STC-MCS241U3V camera and a CBC 3828KRW fisheye lens. The camera can capture
images of 1920 × 1200 pixels at 160 frames per second. The ASC was inserted into
a mock-up of a collapsed house through a hole in a wall on its 2nd floor, exploring
the inside by first going straight on the 2nd floor and then going down to the 1st floor.
Figure 2.33 shows example images captured by the ASC at four different time
points during the exploration, and Fig. 2.34 shows the 3D reconstruction obtained
by the proposed system at the same four time points. The connected red lines
indicate the estimated trajectory of the camera, and the points in various colors
are the estimated scene points. The latest camera pose is represented by the
tri-colored coordinate axes. These figures confirm that the ASC first explored the
2nd floor of the mock-up and then moved down to the 1st floor for further
exploration.
Fig. 2.33 Example images captured by the ASC exploring inside a mock-up of a collapsed house
Fig. 2.34 Examples of 3D reconstruction by the proposed visual SLAM system. Camera trajectories
are shown in red lines
We repeated this exploration test several times. The recovery procedure of our
frame selection method was executed dozens of times per exploration on average.
Thus, it contributed to robust and smooth operation of visual SLAM even though the
ASC underwent multiple abrupt, large motions. However, there were a few complete
failures per exploration. Even in those cases, visual SLAM could be continued by
"re-localizing" the camera in the map constructed so far. Although the success rate
of the re-localization procedure was 100%, it required special maneuvering of the
ASC. These failures may be attributable to excessively fast motion of the ASC head
beyond the coverage of the 160-fps frame rate, making it impossible to keep sufficient
overlap between consecutive frames, or to motion blur generated by motion faster
than the camera exposure time, which was set in the range of 1 to 3 ms.
2.4.5 Summary
2.5.1 Introduction
An active scope camera (ASC) with the ciliary vibration drive mechanism [19, 22,
39, 49] lacks the sensing capability to provide an operator with contact information
about the surrounding environment. An operator can hardly perceive contact
situations only by monitoring the video image from the head camera. Although
snake-like robots require tactile sensors on their bodies, they do not have enough
space to install many sensors. Furthermore, the poor durability of tactile sensors
under collision and abrasion is another problem with mounting them on the surface
of the robot.
This section shows an approach to providing contact information to an operator of
the ASC using a simple configuration of vibration sensors and a vibrator. Figure 2.35
(a estimation of collision intensity and angle; b visuo-haptic feedback)
shows the concept of a visuo-haptic feedback system. For the sensing side, a contact
estimation is based on a limited number of distributed vibration sensors, which
are installed inside the head body of the ASC to avoid damage from direct
collisions and abrasion. For the display side, visual and vibrotactile feedback are
combined to provide the operator with both directional and temporal cues for
perceiving contact events. This feedback is expected to help the operator
understand the surrounding environment and thereby improve the performance
of remote operations.
Visual feedback systems are frequently used for supporting remote operations in
search and rescue activities. Visual information such as a multi-camera view and a
3D map provides a good sense of orientation for the operators [13]. In general, a
visual feedback system is superior in representing spatial information such as
distances, directions, and locations. However, it is not good at representing temporal
information such as the timing of collisions and their rapid movements, because
human vision has a limited temporal resolution of less than 30 Hz. Therefore, visual
feedback is not suitable for conveying collision events, which include high-frequency
components of several hundred Hz.
On the other hand, recent studies have shown that even single-DOF vibrotactile
feedback, including the high-frequency components that represent collision events,
provides useful information to the operator in telerobotic surgery [44]. In robotic
surgery, a sophisticated stereo camera system is available, and the operator can
assess the contact situation from both the visual and the vibrotactile feedback. In
the case of the ASC, only the head camera is available, so the operator cannot see
the contact location directly; a single-DOF vibrotactile feedback system may
therefore not work well for the ASC.
Using multiple vibrators is another way to provide spatial information instead
of visual feedback. Several studies have reported vibration-array systems that
represent distance [56] and contact location [11] to complement the limited vision
systems of rescue robots. Haptic illusions such as apparent motion and phantom
sensation may be useful for reducing the number of vibrators [31, 54]. However,
multiple-vibration approaches allow less intuitive judgment of the perceived spatial
information than visual feedback does.
A combination of visual and haptic feedback for contact events is therefore
expected to work well in terms of both spatial and temporal representation. A
combination of visual feedback and multiple-vibration feedback representing the
contact orientation has been reported to achieve better performance in robot
operation in a simulator [14]. Our target is to combine visual feedback of synthesized
directional information with haptic feedback that includes the high-frequency
vibrations measured at the ASC.
The ASC with the vibration sensing mechanism was developed based on the
tube-shaped ASC for vertical exploration; Fig. 2.36a shows an overview. It has a
two-stage bending mechanism based on McKibben-type actuators, which generate
contractile, linear motion when driven by pressurized air. Four bending actuators
on each stage are arranged radially in the tube-shaped body and bend the outer
tube under air pressure control.
Four piezoelectric vibration sensors (VS-BV203, NEC TOKIN), which have high
sensitivity (minimum detectable signal: 0.003 m/s²) and a wide frequency response
(10 Hz–15 kHz), are attached to the inner wall of the head body of the ASC and
arranged circularly at regular intervals, as shown in Fig. 2.36b. Since each sensor
captures vibration with a single degree of freedom, a combination of the vibration
signals from the four sensors is used to estimate the collision angle and to provide
a vibrotactile feedback signal to the operator.
Fig. 2.36 ASC with vibration sensing mechanism and bending mechanism. a ASC with vibration
sensor array and bending mechanism. b Cross-sectional view of the vibration sensor area
Fig. 2.37 a Haptic feedback for representing collision (joystick controller with vibrator). b Four
sensor values and vibrator command. c Positional condition for measuring reaction time
2.5.4.1 Apparatus
2.5.4.2 Procedure
Figure 2.37c shows the conditions of the experiment. The reaction time to a
collision with an obstacle was measured in trials under eight conditions (with and
without vibrotactile feedback × four positional conditions of the obstacle). In each
trial, the ASC started to move in one direction after the participant pushed a start
button, and the participant pushed an end button upon noticing the collision of the
ASC with the obstacle. A camera image was displayed to the participant.
Reaction time was measured 10 times for each condition at a sampling frequency
of 5 kHz. Two participants volunteered for the experiment; both were naive with
respect to its objective.
Fig. 2.38 Reaction time at the two conditions with and without haptic feedback
2.5.4.3 Result
It is known that the variation of human reaction time to stimuli does not follow a
normal distribution but rather an ex-Gaussian (exponentially modified Gaussian)
distribution [61]. Therefore, to evaluate the difference in reaction time between the
two feedback conditions, a logarithmic transformation was applied to the measured
reaction times, which are shown in Fig. 2.38. An F-test showed no significant
difference in variance between the conditions with and without haptic feedback
(p > 0.05) at the four positional conditions. Student's t-test showed significant
differences between the two haptic feedback conditions for positions B and C
(p < 0.05 for both), but no significant differences for positions A and D (p > 0.05
for both).
In terms of positions B and C, a velocity of ASC at a collision timing changed
flexibly; therefore, it is difficult for the participants to estimate a collision timing
based on visual ques, and haptic cues became a valuable information for a collision
estimation. These results indicated the vibrotactile feedback system supported the
operation of ASC.
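As an illustration of this analysis pipeline, the sketch below log-transforms reaction times and computes a pooled-variance Student's t statistic by hand. The data and the hand-rolled statistic are hypothetical stand-ins, not the book's actual measurements or software.

```python
import math
import statistics

def log_transform(times):
    """Log-transform reaction times so the skewed distribution becomes
    closer to normal before applying Student's t-test."""
    return [math.log(t) for t in times]

def t_statistic(a, b):
    """Two-sample Student's t statistic with pooled variance (equal
    variances assumed, as justified in the text by the F-test result)."""
    na, nb = len(a), len(b)
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    sp2 = ((na - 1) * statistics.variance(a)
           + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))

# Hypothetical reaction times [s]: without vs. with vibrotactile feedback.
without_fb = [0.95, 1.10, 1.30, 0.90, 1.50, 1.20, 1.05, 1.40, 1.15, 1.25]
with_fb = [0.60, 0.75, 0.55, 0.80, 0.65, 0.70, 0.58, 0.72, 0.68, 0.62]
t = t_statistic(log_transform(without_fb), log_transform(with_fb))
```

A large positive t here would correspond to the reported effect: feedback shortens the reaction time.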
Collision vibrations from eight different angles were measured. The measured vibrations contain the driving noise of the ASC; therefore, a low-pass filter with a cutoff frequency of 25 Hz was applied to them. Figure 2.39 shows the measured vibrations for two different angles (θ = 0, 5π/4).
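The filtering step can be illustrated as below. The book does not state the filter type, so a first-order discrete RC low-pass filter is assumed here; the 5 kHz sampling rate and 25 Hz cutoff are taken from the text.

```python
import math

def lowpass(signal, fs, fc):
    """First-order low-pass (discrete RC) filter.

    A simple stand-in for the 25 Hz low-pass filter applied in the text to
    suppress the ASC's driving-motor noise; illustrative only.
    """
    dt = 1.0 / fs
    rc = 1.0 / (2.0 * math.pi * fc)
    alpha = dt / (rc + dt)
    out = [signal[0]]
    for x in signal[1:]:
        out.append(out[-1] + alpha * (x - out[-1]))
    return out

fs = 5000.0  # sampling frequency [Hz], as in the experiment
t = [i / fs for i in range(5000)]
# A 10 Hz "collision" component plus a 200 Hz "drive-noise" component.
raw = [math.sin(2 * math.pi * 10 * ti) + 0.5 * math.sin(2 * math.pi * 200 * ti)
       for ti in t]
filtered = lowpass(raw, fs, 25.0)
```

With a 25 Hz cutoff, the 200 Hz component is strongly attenuated while the 10 Hz component passes almost unchanged.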
Fig. 2.39 Measured vibration signals of the four sensors: (a) collision angle θ = 0 [rad]; (b) collision angle θ = 5π/4 [rad]
A collision angle is assumed to be related to the positive and negative peak values of the four sensor signals. Therefore, the feature values P1k and P2k are defined as

P1k = max_t fk(t),   P2k = min_t fk(t),

where fk(t) (k = 1, 2, 3, 4) is the filtered signal of sensor k. The training data for the SVM model are the eight feature values measured in 160 trials (20 trials for each of the eight collision angles).
The result of cross-validation of the learned model is shown in Table 2.1. The results show estimation accuracies higher than 0.90, which indicates that the learned model can effectively estimate a collision angle.
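Assuming P1k and P2k are simply the positive and negative peaks of each filtered signal, the feature extraction can be sketched as below. A dependency-free nearest-centroid rule on toy signals stands in for the SVM classifier actually used in the book.

```python
import math

def peak_features(filtered):
    """Eight feature values from four filtered sensor signals:
    P1k is the positive peak and P2k the negative peak of fk(t)."""
    feats = []
    for f in filtered:        # f is the sampled signal of one sensor
        feats.append(max(f))  # P1k
        feats.append(min(f))  # P2k
    return feats

def nearest_angle(feats, centroids):
    """Stand-in classifier: nearest centroid in feature space.
    centroids maps a collision angle to an 8-element feature centroid."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(centroids, key=lambda ang: dist(feats, centroids[ang]))

# Hypothetical toy signals for two collision angles (theta = 0 and pi):
# each inner list is one sensor's filtered samples.
signals_0 = [[0.9, -0.2], [0.1, -0.1], [0.1, -0.9], [0.1, -0.1]]
signals_pi = [[0.1, -0.9], [0.1, -0.1], [0.9, -0.2], [0.1, -0.1]]
centroids = {0.0: peak_features(signals_0), math.pi: peak_features(signals_pi)}
```

The book instead trains an SVM on the 160 labelled trials and evaluates it by cross-validation; the feature definition is the part shared with this sketch.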
An interface for transmitting the estimated direction of collisions to the operator as a visual cue was developed. Figure 2.40 shows an example of visual feedback that represents the estimated collision direction and magnitude by red colored bars peripherally superposed on a video image.
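One hypothetical way to realize such peripheral bars is to map the estimated angle to a border region of the frame. The layout below (angle 0 on the right edge, angles increasing counter-clockwise) is an assumption; the text does not specify the mapping.

```python
import math

def border_segment(angle, width, height, thickness=20):
    """Map an estimated collision angle to the peripheral region of a video
    frame where a red bar would be drawn.

    Returns (x0, y0, x1, y1) of the rectangle to fill. Assumed layout:
    angle 0 is the right edge, angles increase counter-clockwise.
    """
    a = angle % (2 * math.pi)
    if a < math.pi / 4 or a >= 7 * math.pi / 4:    # right edge
        return (width - thickness, 0, width, height)
    if a < 3 * math.pi / 4:                        # top edge
        return (0, 0, width, thickness)
    if a < 5 * math.pi / 4:                        # left edge
        return (0, 0, thickness, height)
    return (0, height - thickness, width, height)  # bottom edge
```

The collision magnitude could then modulate the bar's thickness or color intensity before it is drawn over the camera image.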
2.5.6 Conclusion
This section reported an approach for transmitting the contact information of a remotely operated Active Scope Camera (ASC) to an operator. Four sensors were installed in the head of the ASC to measure propagated vibrations, which provide vibrotactile feedback signals to the operator. The evaluation experiment verified that the vibrotactile feedback reduced the reaction time to a collision with an obstacle. Then, a methodology for estimating a collision angle using an SVM was developed based on vibration signals measured at eight collision angles. Cross-validation showed that the developed method estimates a collision angle with high probability (90.0% in the worst condition). Finally, a visual feedback method was developed that uses colored bars, peripherally superposed on the video image, to show the estimated collision angle.
As future development, the estimation of contact positions along the long body of the ASC is currently under development. The long body of the ASC sometimes becomes stuck in complex structures, and it is difficult for an operator to recognize this situation from the tip camera image. Therefore, not only tip collision estimation but also longitudinal contact estimation using a vibration sensor array is expected to be a useful feedback system. The concept of an ASC equipped with the method for estimating the
Fig. 2.41 Concept of estimation of contact position at the long body of ASC
contact position along the long body is shown in Fig. 2.41. A vibration sensor array installed along the long body of the ASC in the longitudinal direction can measure the vibrations propagated from the driving vibration motors and may distinguish contact from non-contact conditions based on the difference in these vibrations.
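A minimal sketch of this contact/non-contact discrimination, under the assumption that a body segment pressed against rubble shows damped drive-induced vibration; the RMS-ratio rule and all numbers here are illustrative only.

```python
import math

def rms(window):
    """Root-mean-square amplitude of one sensor's sample window."""
    return math.sqrt(sum(x * x for x in window) / len(window))

def contact_mask(sensor_windows, baselines, ratio=0.5):
    """Flag likely contact at each body segment.

    Hypothetical rule: the ciliary vibration motors continuously excite the
    body, and a segment pressed against rubble is assumed to show damped
    vibration, so a sensor whose RMS falls below `ratio` times its free-air
    baseline is flagged as being in contact.
    """
    return [rms(w) < ratio * b for w, b in zip(sensor_windows, baselines)]

baselines = [1.0, 1.0, 1.0]      # assumed free-air RMS for each sensor
windows = [
    [0.9, -1.1, 1.0, -0.8],      # free segment: vibration near baseline
    [0.2, -0.1, 0.15, -0.2],     # damped vibration -> likely contact
    [1.1, -0.9, 0.95, -1.05],    # free segment
]
mask = contact_mask(windows, baselines)
```

Whether contact damps or alters the propagated vibration in exactly this way would need to be confirmed experimentally, as the section itself notes this work is ongoing.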
Acknowledgements This work was supported by the Impulsing Paradigm Change through Disruptive Technologies (ImPACT) Tough Robotics Challenge program of the Japan Science and Technology Agency (JST).
References
1. Albl, C., Sugimoto, A., Pajdla, T.: Degeneracies in rolling shutter SfM. In: Proceedings of
European Conference on Computer Vision, pp. 36–51 (2016)
2. Ambe, Y., Yamamoto, T., Kojima, S., Takane, E., Tadakuma, K., Konyo, M., Tadokoro, S.:
Use of active scope camera in the Kumamoto Earthquake to investigate collapsed houses. In:
2016 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), pp.
21–27. IEEE (2016). https://doi.org/10.1109/SSRR.2016.7784272, http://ieeexplore.ieee.org/
document/7784272/
3. Ando, H., Ambe, Y., Ishii, A., Konyo, M., Tadakuma, K., Maruyama, S., Tadokoro, S.: Aerial
hose type robot by water jet for fire fighting. IEEE Robot. Autom. Lett. 3(2), 1128–1135 (2018).
https://doi.org/10.1109/LRA.2018.2792701
4. Babacan, S.D., Luessi, M., Molina, R., Katsaggelos, A.K.: Sparse Bayesian methods for low-
rank matrix estimation. IEEE Trans. Signal Process. 60(8), 3964–3977 (2012)
5. Bando, Y., Itoyama, K., Konyo, M., Tadokoro, S., Nakadai, K., Yoshii, K., Kawahara, T.,
Okuno, H.G.: Speech enhancement based on Bayesian low-rank and sparse decomposition of
multichannel magnitude spectrograms. IEEE/ACM Trans. Audio Speech Lang. Process. 26(2),
215–230 (2018). https://doi.org/10.1109/TASLP.2017.2772340
6. Bando, Y., Itoyama, K., Konyo, M., Tadokoro, S., Nakadai, K., Yoshii, K., Okuno, H.G.:
Human-voice enhancement based on online RPCA for a hose-shaped rescue robot with a
microphone array. In: IEEE International Symposium on Safety, Security, and Rescue Robotics
(SSRR), pp. 1–6 (2015)
7. Bando, Y., Itoyama, K., Konyo, M., Tadokoro, S., Nakadai, K., Yoshii, K., Okuno, H.G.:
Microphone-accelerometer based 3D posture estimation for a hose-shaped rescue robot. In:
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5580–5586
(2015)
8. Bando, Y., Itoyama, K., Konyo, M., Tadokoro, S., Nakadai, K., Yoshii, K., Okuno, H.G.: Vari-
ational Bayesian multi-channel robust NMF for human-voice enhancement with a deformable
and partially-occluded microphone array. In: Proceedings of the 21st European Signal Process-
ing Conference (EUSIPCO), pp. 1018–1022 (2016)
9. Bando, Y., Saruwatari, H., Ono, N., Makino, S., Itoyama, K., Kitamura, D., Ishimura, M.,
Takakusaki, M., Mae, N., Yamaoka, K., et al.: Low latency and high quality two-stage human-
voice-enhancement system for a hose-shaped rescue robot. J. Robot. Mechatron. 29(1), 198–
212 (2017)
10. Bishop, C.M.: Pattern Recognition and Machine Learning (Information Science and Statistics).
Springer, Berlin (2007)
11. Bloomfield, A., Badler, N.I.: Collision awareness using vibrotactile arrays. In: IEEE Virtual
Reality Conference 2007 (VR'07), pp. 163–170 (2007)
12. Candès, E.J., Li, X., Ma, Y., Wright, J.: Robust principal component analysis? J. ACM 58(3),
11 (2011)
13. Chen, J.Y., Haas, E.C., Barnes, M.J.: Human performance issues and user interface design
for teleoperated robots. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 37(6), 1231–1245
(2007)
14. De Barros, P.G., Lindeman, R.W., Ward, M.O.: Enhancing robot teleoperator situation aware-
ness and performance using vibro-tactile and graphical feedback. In: 2011 IEEE Symposium
on 3D User Interfaces (3DUI), pp. 47–54. IEEE (2011)
15. Engel, J., Koltun, V., Cremers, D.: Direct sparse odometry. IEEE Trans. Pattern Anal. Mach.
Intell. 40(3), 611–625 (2018)
16. Engel, J., Schöps, T., Cremers, D.: LSD-SLAM: large-scale direct monocular SLAM. In: Pro-
ceedings of European Conference on Computer Vision, pp. 834–849. Springer (2014)
17. Feng, J., Xu, H., Yan, S.: Online robust PCA via stochastic optimization. In: Advances in
Neural Information Processing Systems (NIPS), pp. 404–412 (2013)
18. Févotte, C., Dobigeon, N.: Nonlinear hyperspectral unmixing with robust nonnegative matrix
factorization. IEEE Trans. Image Process. 24(12), 4810–4819 (2015)
19. Fukuda, J., Konyo, M., Takeuchi, E., Tadokoro, S.: Remote vertical exploration by Active Scope
Camera into collapsed buildings. In: 2014 IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS), pp. 1882–1888. IEEE (2014). https://doi.org/10.1109/IROS.2014.
6942810, http://ieeexplore.ieee.org/document/6942810/
20. Fukuda, J., Konyo, M., Takeuchi, E., Tadokoro, S.: Remote vertical exploration by Active Scope
Camera into collapsed buildings. In: 2014 IEEE/RSJ International Conference on Intelligent
Robots and Systems, pp. 1882–1888 (2014). https://doi.org/10.1109/IROS.2014.6942810
21. Hartley, R.I., Zisserman, A.: Multiple View Geometry in Computer Vision, 2nd edn. Cambridge
University Press, Cambridge (2004). ISBN: 0521540518
22. Hatazaki, K., Konyo, M., Isaki, K., Tadokoro, S., Takemura, F.: Active scope camera for urban
search and rescue. In: 2007 IEEE/RSJ International Conference on Intelligent Robots and
Systems (IROS), pp. 2596–2602 (2007). https://doi.org/10.1109/IROS.2007.4399386, http://
ieeexplore.ieee.org/document/4399386/
23. Heyden, A., Åström, K.: Minimal conditions on intrinsic parameters for euclidean reconstruc-
tion. In: Proceedings of Asian Conference on Computer Vision, pp. 169–176 (1998)
24. Heyden, A., Åström, K.: Euclidean reconstruction from image sequences with varying and
unknown focal length and principal point. In: Proceedings of Computer Vision and Pattern
Recognition, pp. 438–446 (1997)
25. Heyden, A., Astrom, K.: Flexible calibration: Minimal cases for auto-calibration. In: Proceed-
ings of International Conference on Computer Vision, pp. 350–355 (1999)
26. Hinton, G.E.: Training products of experts by minimizing contrastive divergence. Neural Com-
put. 14(8), 1771–1800 (2002)
27. Hioka, Y., Kingan, M., Schmid, G., Stol, K.A.: Speech enhancement using a microphone
array mounted on an unmanned aerial vehicle. In: International Workshop on Acoustic Signal
Enhancement (IWAENC), pp. 1–5 (2016)
28. Hoffman, M.D.: Poisson-uniform nonnegative matrix factorization. In: IEEE International Con-
ference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5361–5364 (2012)
29. Ishii, A., Ambe, Y., Yamauchi, Y., Ando, H., Konyo, M., Tadakuma, K., Tadokoro, S.: Design
and development of biaxial active nozzle with flexible flow channel for air floating active
scope camera. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems,
(Accepted) (2018)
30. Ishikura, M., Takeuchi, E., Konyo, M., Tadokoro, S.: Shape estimation of flexible cable. In:
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2539–2546
(2012)
31. Israr, A., Poupyrev, I.: Tactile brush: drawing on skin with a tactile grid display. In: Proceedings
of the SIGCHI Conference on Human Factors in Computing Systems, pp. 2019–2028. ACM
(2011)
32. Ito, E., Okatani, T.: Self-calibration-based approach to critical motion sequences of rolling-
shutter structure from motion. In: Proceedings of Computer Vision and Pattern Recognition,
pp. 4512–4520 (2017)
33. Julier, S.J.: The scaled unscented transformation. In: American Control Conference, vol. 6, pp.
4555–4559 (2002)
34. Kahl, F., Triggs, B., Åström, K.: Critical motions for auto-calibration when some intrinsic
parameters can vary. J. Math. Imaging Vis. 13, 131–146 (2000)
35. Kamio, S., Ambe, Y., Ando, H., Konyo, M., Tadakuma, K., Maruyama, S., Tadokoro, S.:
Air-floating-type active scope camera with a flexible passive parallel mechanism for climbing
rubble. In: 2016 SICE Domestic Conference on System Integration (in Japanese), pp. 0639 –
0642 (2016)
36. Kim, T.: Real-time independent vector analysis for convolutive blind source separation. IEEE
Trans. Circuits Syst. I 57(7), 1431–1438 (2010)
37. Klein, G., Murray, D.: Parallel tracking and mapping for small AR workspaces. In: Proceedings
of IEEE and ACM International Symposium on Mixed and Augmented Reality, pp. 225–234
(2007)
38. Knapp, C., Carter, G.C.: The generalized correlation method for estimation of time delay. IEEE
Trans. Acoust. Speech Signal Process. (TASSP) 24(4), 320–327 (1976)
39. Konyo, M., Isaki, K., Hatazaki, K., Tadokoro, S., Takemura, F.: Ciliary vibration drive mech-
anism for active scope cameras. J. Robot. Mechatron. 20(3), 490–499 (2008). https://doi.org/
10.20965/jrm.2008.p0490
40. Lee, J., Ukawa, G., Doho, S., Lin, Z., Ishii, H., Zecca, M., Takanishi, A.: Non visual sensor based
shape perception method for gait control of flexible colonoscopy robot. In: IEEE International
Conference on Robotics and Biomimetics (ROBIO), pp. 577–582 (2011)
41. Li, Y., et al.: Speech enhancement based on robust NMF solved by alternating direction method
of multipliers. In: IEEE International Workshop on Multimedia Signal Processing (MMSP),
pp. 1–5 (2015)
42. Maybank, S.J., Faugeras, O.D.: A theory of self-calibration of a moving camera. Int. J. Comput.
Vis. 8(2), 123–151 (1992)
43. Mazumdar, A., Asada, H.H.: Pulse width modulation of water jet propulsion systems using
high-speed Coanda-effect valves. ASME J. Dyn. Syst. Meas. Control 135(5), 051019 (2013).
https://doi.org/10.1115/1.4024365
44. McMahan, W., Gewirtz, J., Standish, D., Martin, P., Kunkel, J.A., Lilavois, M., Wedmid, A.,
Lee, D.I., Kuchenbecker, K.J.: Tool contact acceleration feedback for telerobotic surgery. IEEE
Trans. Haptics 4(3), 210–220 (2011)
45. Miura, H., Yoshida, T., Nakamura, K., Nakadai, K.: SLAM-based online calibration for asyn-
chronous microphone array. Adv. Robot. 26(17), 1941–1965 (2012)
46. Mur-Artal, R., Montiel, J.M.M., Tardos, J.D.: ORB-SLAM: a versatile and accurate monocular
SLAM system. IEEE Trans. Robot. 31(5), 1147–1163 (2015)
47. Mur-Artal, R., Tardós, J.D.: ORB-SLAM2: an open-source SLAM system for monocular,
stereo, and RGB-D cameras. IEEE Trans. Robot. 33(5), 1255–1262 (2017)
48. Nakadai, K., Takahashi, T., Okuno, H.G., Nakajima, H., Hasegawa, Y., Tsujino, H.: Design
and implementation of robot audition system HARK – open source software for listening to
three simultaneous speakers. Adv. Robot. 24(5–6), 739–761 (2011)
49. Namari, H., Wakana, K., Ishikura, M., Konyo, M., Tadokoro, S.: Tube-type active scope camera
with high mobility and practical functionality. In: 2012 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS), pp. 3679–3686 (2012). https://doi.org/10.1109/IROS.
2012.6386172, http://ieeexplore.ieee.org/document/6386172/
50. Newcombe, R.A., Lovegrove, S.J., Davison, A.J.: DTAM: dense tracking and mapping in real-
time. In: Proceedings of International Conference on Computer Vision, pp. 2320–2327. IEEE
(2011)
51. Nugraha, A.A., Liutkus, A., Vincent, E.: Multichannel audio source separation with deep neural
networks. IEEE/ACM Trans. Audio Speech Lang. Process. (TASLP) 24(9), 1652–1664 (2016)
52. Okamoto, S., Konyo, M., Saga, S., Tadokoro, S.: Detectability and perceptual consequences
of delayed feedback in a vibrotactile texture display. IEEE Trans. Haptics 2(2), 73–84 (2009)
53. Ono, N., Kohno, H., Ito, N., Sagayama, S.: Blind alignment of asynchronously recorded signals
for distributed microphone array. In: IEEE Workshop on Applications of Signal Processing to
Audio and Acoustics (WASPAA), pp. 161–164 (2009)
54. Ooka, T., Fujita, K.: Virtual object manipulation system with substitutive display of tangential
force and slip by control of vibrotactile phantom sensation. In: 2010 IEEE Haptics Symposium,
pp. 215–218 (2010)
55. Pollefeys, M., Koch, R., Van Gool, L.: Self-calibration and metric reconstruction in spite of
varying and unknown intrinsic camera parameters. Int. J. Comput. Vis. 31(1), 7–25 (1999)
56. Sibert, J., Cooper, J., Covington, C., Stefanovski, A., Thompson, D., Lindeman, R.W.: Vibro-
tactile feedback for enhanced control of urban search and rescue robots. In: Proceedings of the
IEEE International Workshop on Safety, Security and Rescue Robotics (2006)
57. Silva Rico, J.A., Endo, G., Hirose, S., Yamada, H.: Development of an actuation system based
on water jet propulsion for a slim long-reach robot. ROBOMECH J. 4(1), 8 (2017). https://doi.
org/10.1186/s40648-017-0076-4
58. Strecha, C., von Hansen, W., Van Gool, L., Fua, P., Thoennessen, U.: On benchmarking camera
calibration and multi-view stereo for high resolution imagery. In: Proceedings of Computer
Vision and Pattern Recognition (2008)
59. Sturm, P.: Critical motion sequences for monocular self-calibration and uncalibrated euclidean
reconstruction. In: Proceedings of Computer Vision and Pattern Recognition, pp. 1100–1105
(1997)
60. Suzuki, Y., Asano, F., Kim, H.Y., Sone, T.: An optimum computer-generated pulse signal
suitable for the measurement of very long impulse responses. J. Acoust. Soc. Am. 97, 1119
(1995)
61. Luce, R.D.: Response Times: Their Role in Inferring Elementary Mental Organization. Oxford
University Press, USA (1986)
62. Tully, S., Kantor, G., Choset, H.: Inequality constrained Kalman filtering for the localization and
registration of a surgical robot. In: IEEE/RSJ International Conference on Intelligent Robots
and Systems (IROS), pp. 5147–5152 (2011)
63. Wan, E.A., et al.: The unscented Kalman filter for nonlinear estimation. In: The IEEE Adaptive
Systems for Signal Processing, Communications, and Control Symposium, pp. 153–158 (2000)
64. Wu, C.: Visual SFM. http://ccwu.me/vsfm/
65. Wu, C.: Towards linear-time incremental structure from motion. In: Proceedings of International
Conference on 3D Vision, pp. 127–134 (2013)
66. Xu, Y., Hunter, I.W., Hollerbach, J.M., Bennett, D.J.: An airjet actuator system for identification
of the human arm joint mechanical properties. IEEE Trans. Biomed. Eng. 38(11), 1111–1122
(1991). https://doi.org/10.1109/10.99075
67. Yamauchi, Y., Fujimoto, T., Ishii, A., Araki, S., Ambe, Y., Konyo, M., Tadakuma, K., Tadokoro,
S.: A robotic thruster that can handle hairy flexible cable of serpentine robots for disaster
inspection. In: 2018 IEEE International Conference on Advanced Intelligent Mechatronics
(AIM) (2018). https://doi.org/10.1109/AIM.2018.8452708
68. Zhang, C., Florêncio, D., Zhang, Z.: Why does PHAT work well in low-noise, reverberative
environments? In: IEEE International Conference on Acoustics, Speech, and Signal Processing
(ICASSP), pp. 2565–2568 (2008)
69. Zhang, L., Chen, Z., Zheng, M., He, X.: Robust non-negative matrix factorization. Front. Electr.
Electron. Eng. China 6(2), 192–200 (2011)
Chapter 3
Recent R&D Technologies and Future
Prospective of Flying Robot in Tough
Robotics Challenge
Abstract This chapter consists of Sects. 3.1–3.5. Section 3.1 first describes the
definition of drones and recent trends, and gives a general description of the
important functions of search and rescue flying robots. Section 3.1 also provides an
overview of the R&D technologies of the flying robot in the Tough Robotics Challenge
and a technical and general discussion of the future prospects of flying robots,
including real disaster surveys and technical issues. In short, drones, or unmanned
aerial vehicles (UAVs), are expected to evolve into truly bio-inspired flying robots.
Section 3.2 describes the design and implementation of an embedded sound source
mapping system based on microphone array processing for an unmanned aerial vehi-
cle (UAV). To improve search and rescue tasks in poor lighting conditions and/or
from out of sight, a water-resistant 16 ch spherical microphone array and 3D sound
source mapping software running on a single-board computer have been developed.
The embedded sound source mapping system properly illustrates human-related sound
sources such as whistle sounds and voices on a 3D terrain map in real time, even
when the UAV is inclined.
K. Nonami (B)
Autonomous Control Systems Laboratory, Chiba, Japan
e-mail: nonami@acsl.co.jp
K. Hoshiba
Kanagawa University, Yokohama, Japan
e-mail: hoshiba@kanagawa-u.ac.jp
K. Nakadai
Tokyo Institute of Technology/Honda Research Institute Japan Co., Ltd., Tokyo, Japan
e-mail: nakadai@jp.honda-ri.com
M. Kumon
Kumamoto University, Kumamoto, Japan
e-mail: kumon@gpo.kumamoto-u.ac.jp
H. G. Okuno
Waseda University, Tokyo, Japan
e-mail: okuno@nue.org
© Springer Nature Switzerland AG 2019 77
S. Tadokoro (ed.), Disaster Robotics, Springer Tracts in Advanced Robotics 128,
https://doi.org/10.1007/978-3-030-05321-5_3
78 K. Nonami et al.
Section 3.3 describes a variable pitch control mechanism which makes a drone
more robust against strong wind. Since the mechanism controls the thrust by changing
the pitch angles of rotor blades, it improves the responsivity of thrust control. With
this mechanism, all rotors of the drone can be driven by two motors located near its
center of gravity. It decreases the moments of inertia of the drone. To improve the
robustness of drones, the use of ducted rotors, thrust behavior near walls, guidance
control laws, and the estimation of induced velocity are also proposed and discussed.
Section 3.4 describes a mechanical concept of a robot arm and hand for multicopters.
An arm with four joints and three actuated degrees of freedom is developed,
which can be attached to an off-the-shelf multicopter. As application examples, this
study shows that the developed multicopter can grasp a 1 kg water bottle in midair,
carry emergency supplies, hang a rope ladder, etc.
Y. Tanabe
Japan Aerospace Exploration Agency, Tokyo, Japan
e-mail: tan@chofu.jaxa.jp
K. Yonezawa
Central Research Institute of Electric Power Industry, Tokyo, Japan
e-mail: koichi-y@criepi.denken.or.jp
H. Tokutake
Kanazawa University, Kanazawa, Japan
e-mail: tokutake@se.kanazawa-u.ac.jp
S. Suzuki
Shinshu University, Matsumoto, Japan
e-mail: s-s-2208@shinshu-u.ac.jp
K. Yamaguchi · S. Sunada
Nagoya University, Nagoya, Japan
e-mail: kohei.yamaguchi@mae.nagoya-u.ac.jp
S. Sunada
e-mail: shigeru.sunada@mae.nagoya-u.ac.jp
T. Takaki
Hiroshima University, Hiroshima, Japan
e-mail: takaki@hiroshima-u.ac.jp
T. Nakata · H. Liu
Chiba University, Chiba, Japan
e-mail: tnakata@chiba-u.jp
H. Liu
e-mail: hliu@faculty.chiba-u.jp
R. Noda
Kanto Gakuin University, Yokohama, Japan
e-mail: rnoda@kanto-gakuin.ac.jp
S. Tadokoro
Tohoku University, Sendai, Japan
e-mail: tadokoro@rm.is.tohoku.ac.jp
3 Recent R&D Technologies and Future Prospective of Flying Robot … 79
Finally, the bio-inspired low-noise propellers for drones are introduced in Sect. 3.5.
A prototype low-noise propeller designed for two drones, the Phantom 3 and the PF1,
is developed, inspired by the unique trailing-edge fringes of owl wings. A small plate
attached to the trailing edge is found to suppress the noise level effectively,
although there is a trade-off between acoustic and aerodynamic performance.
The statistic shown in Fig. 3.1 displays the annual economic loss caused by natural
disaster events worldwide from 2000 to 2017. In 2017, some 353 billion U.S. dollars
were lost due to natural disasters, while the global average annual loss over this
period was about 134 billion U.S. dollars [1]. The biggest loss occurred in 2011 and
is attributed to the Great East Japan Earthquake; in 2015 and 2017, great hurricane
damage struck the United States. In such disasters, access from the sky becomes
extremely effective because land routes are cut off. Manned helicopters have been
effective for this aerial access, but from now on compact drones that can be operated
cheaply, promptly, safely, and accurately are expected to take over this role.
According to PwC Consulting LLC, the worldwide market for disaster search and rescue
drones is expected to reach around 10 billion U.S. dollars by the mid-2020s.
A drone is defined as a computer-controlled flying robot capable of autonomous
flight under the control of an embedded computer rather than manual piloting. Drones
are also known as small unmanned aerial vehicles: Unmanned Aerial Vehicle (UAV),
Unmanned Aerial System (UAS), or Remotely Piloted Aircraft System (RPAS). Radio-
Fig. 3.1 Economic loss from natural disaster events globally from 2000 to 2017 (in billion U.S.
dollars) [1]
controlled flying machines that people manipulate wirelessly are also counted as
drones because rate-gyro feedback is already installed in them. After test flights
before World War II for military purposes, drones were first deployed in actual
warfare in the Iraq War and the Afghan War from around 2000. Meanwhile, sensors,
microprocessors, and other components have been miniaturized through the evolution
of mobile phones and smartphones, and their performance has drastically improved.
Benefiting from these advances, multi-rotor helicopters weighing several hundred
grams to several kilograms and capable of electric-powered autonomous flight were
born about 20 years ago.
These drones, ranging from hobby use to industrial use, came to be said to bring
about "the industrial revolution of the sky." From the viewpoint of technical
completeness, however, these drones still have many problems in terms of safety,
reliability, and durability; the technology is at its dawn, and until recently drones
were used mainly for hobbies. From 2016, however, industrial utilization gradually
began to expand. Major industrial drone applications were short-range flights within
visual line of sight: agriculture, infrastructure inspection, surveying, security,
and so on. Recently, however, long-distance flights of 10–20 km or more and
long-duration flights of nearly one hour have become possible, so drones are coming
to be used not only for logistics and home delivery but also for disaster surveys.
Flying robots for search and rescue have several important functions that differ
from those of manned helicopters and other existing means. (1) By launching
immediately after the occurrence of a disaster, a precise grasp of the damage
situation can be obtained by aerial photography cheaply, quickly, and at ultra-low
altitude, so that rescue workers on the ground can grasp the situation in more
detail and respond accurately. Furthermore, by collecting information with a drone
over places that people cannot enter, secondary disasters can be prevented. At the
same time, the latitude and longitude of a person requiring rescue can be checked
from the position information of the drone. Recently it has also become possible to
reconstruct 3D maps with high precision from aerial photographs and laser surveys.
(2) An infrared camera can confirm heat sources that cannot be confirmed with a
visible camera, which is useful for searching for people in mountain areas and for
checking heat sources such as fire sites. (3) When people requiring rescue are in
places of the disaster site that rescuers cannot enter, relief supplies can be
transported: an AED (Automated External Defibrillator) can be delivered pinpoint to
where it is needed, lifesaving floats can be dropped in water accidents, and other
equipment such as radios and food can be carried. (4) By installing a speaker on
the drone, the operator can give voice calls or instructions from the ground control
side to a person awaiting rescue in a place that people cannot enter. (5) By
transmitting images of the disaster site in real time to the disaster relief
headquarters, it will also become possible in the future to give precise
instructions by voice from the headquarters.
The book [2] authored by R. R. Murphy describes the basic concepts and operation
methods of comprehensive disaster search and rescue robots based on experience from
real disasters in past cases. In particular, the book also proposes details on the
form and operation of disaster search and rescue flying robots and what such robots
should be. However, it does not go deeply into the
technical point of view. On the other hand, there is a book [3] that covers UAVs and
flying robots comprehensively and systematically, including technical viewpoints.
For UAV control, nonlinear control, adaptive control, and fault-tolerant control are
studied in detail there, but all in the fixed-wing domain. It also discusses
autonomy, but it refers to the ten levels of autonomy aimed at military use by the
U.S. Department of Defense [3, 4]. Research that categorizes the autonomy of
civilian UAVs into five stages has also appeared in recent years, but the
differences between stages are not clearly defined [5, 6]. This chapter discusses
the issues to be achieved as functions of disaster search and rescue flying robots
from a technical point of view.
Section 3.1 consists of two parts. In part 1, Sect. 3.1.2 gives an overview of the
flying robot in the Tough Robotics Challenge (TRC), and Sect. 3.1.3 presents
valuable results from actually applying the flying robot of TRC when a disaster
occurred. In part 2, Sects. 3.1.4–3.1.6 look ahead to the future of flying robots,
including disaster search and rescue flying robots, from a technical point of view.
Among them, some of the research results on the flying robot in TRC are introduced.
Section 3.1.7 is a summary of this section.
Fig. 3.2 Overview of recent R&D technologies of flying robot in tough robotics challenge
aircraft and the ground control station (GCS); high-precision positioning requiring
no electronic compass; a multiple-resolution database for three-dimensional
environmental recognition; a light-weight, small-size device that can measure the
local wind profile; supervisor control and model predictive control; and so on.
Next-generation disaster survey drones will appear as flying robots in which such
technologies are highly integrated.
3.1.3.1 Introduction
In July 2017, record-breaking heavy rain struck the northern part of Kyushu, mainly
Fukuoka and Oita prefectures, and disasters such as landslides and road damage
occurred one after another. In response to this disaster, the Cabinet Office applied
to the disaster site the project of Prof. Satoshi Tadokoro, program manager of the
Innovative R&D Promotion Program (ImPACT), together with the system developed by
Autonomous Control Systems Laboratory, Ltd. (ACSL), and dispatched investigation
groups to the site along with drones the day after the disaster occurred. Here, as
members of this survey group, the authors describe what the ACSL staff and others
did on site in the investigation of the disaster area, and the issues that were
highlighted by actually operating drones at the disaster site.
The drone used in the investigation of this disaster site was the ACSL surveying
drone (PF-1 Survey) (see Figs. 3.3, 3.4 and Table 3.1). This drone can fly stably
even at a speed of 72 km/h, and this high-speed flight performance, together with
the quad-lens high-speed camera mounted below the drone, makes it possible to survey
a wide area in a short time. In addition, the platform drone has acquired IPX3
certification and can fly even in heavy rain and strong winds that are difficult for
general drones, so it can be operated even in harsh environments such as disaster
sites. The ACSL drone can not only be piloted manually but can also fly autonomously
along a route prescribed in advance. Information from the drone during flight can be
displayed by the GCS application software on a personal computer via radio
communication, even from several kilometers away in a place with good visibility
(see Fig. 3.5). In addition, a flight recorder that can record the scenery in front
of the drone as a movie is installed, and if a video transmission device is
additionally installed, the video can also be monitored in real time on a PC.
The quad-lens camera shown in Fig. 3.4 can take crisp photographs even at high flight speeds. Stitching aerial photographs for professional measurement and surveying requires overlap ratios of about 80%. Most cameras on the market cannot keep up with the capture frequency needed to meet this requirement at high flight speeds, so ACSL developed a quad-lens camera that takes crisp photographs at a high rate. In a single flight, a PF1-Survey can cover 100 acres,
Fig. 3.5 The display of GCS (Ground Control Station) for disaster survey
capturing 20-megapixel images with 80% overlap at a resolution of 3–4 cm²/pixel. Precision can be traded for even larger coverage by adjusting the altitude as needed. The airframe is also equipped with an FPV flight recorder with innovative digital stabilization that does not require a mechanical gimbal. The specifications of the ACSL PF1-Survey are given in Table 3.1.
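The required capture rate follows directly from the flight speed, the image footprint, and the overlap ratio. The sketch below illustrates the arithmetic; the 100 m altitude and 60° along-track field of view are illustrative assumptions, not PF1-Survey specifications.

```python
import math

def capture_interval(speed_mps, altitude_m, fov_deg, overlap):
    """Time between shutter triggers needed to keep a given
    along-track overlap ratio at a given speed and altitude."""
    # Along-track ground footprint of one image (pinhole model).
    footprint = 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)
    # New ground covered per frame is the non-overlapping fraction.
    advance = footprint * (1.0 - overlap)
    return advance / speed_mps

# Illustrative values: 72 km/h = 20 m/s, 100 m altitude,
# 60 deg along-track field of view, 80% overlap.
interval = capture_interval(20.0, 100.0, 60.0, 0.80)
print(f"required interval: {interval:.2f} s")  # → required interval: 1.15 s
```

Flying lower or faster shortens this interval further, which is why an ordinary consumer camera that cannot sustain a high 20-megapixel capture rate fails to maintain the overlap.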
The ACSL staff first arrived at Toho Village in Fukuoka Prefecture, the disaster site where they had been asked to conduct the survey, at around 3 pm on Saturday, July 8, 2017, two days after the disaster occurred. At the site, the Japan Self-Defense Forces and firefighters were carrying out search activities for missing people and removing driftwood. Because the road leading from the village office to the mountain was closed by landslides and road damage, there were areas that were difficult for people to enter, and the purpose of the ACSL staff was to investigate the situation there by drone. The survey area lay several kilometers beyond the limit of where people could enter, and because the site was surrounded by high mountains, the places to be investigated were in the shadow of the mountains, out of sight.
Initially, the staff planned to survey by autonomous flight. For that purpose, however, it was necessary to download map information around the site onto a personal computer and create a drone flight path, and since there was no Internet access at the site, map information could not be acquired and an appropriate flight route could not be planned immediately. Furthermore, because flight beyond visual line of sight without observers is prohibited, permission must be obtained from the Ministry of Land,
86 K. Nonami et al.
Infrastructure and Transport in order to fly beyond visual line of sight behind the mountain. The staff judged a full-scale investigation to be difficult that day. For this reason, on the first day they flew only about 300 m, within visual range, by manual piloting, and photographed the surroundings in a simple manner.
On the second day, the staff downloaded the map information for the survey area in a location with Internet access and planned a round-trip flight route of about 6 km. When they arrived at the site, restoration work was proceeding as on the previous day; the area that people could enter had expanded somewhat, but the mountain side still could not be entered. Long-distance flight beyond visual line of sight without observers had already been approved through the Cabinet Office as an emergency special case for search and rescue under Article 132-3 of the Japanese aviation law, so the staff were ready to fly the drones immediately. However, multiple manned rescue and news helicopters were operating at the scene, and there was a high possibility of collision if a drone were flown as things stood. The staff therefore proceeded carefully to prevent accidents, staying in close contact with the Cabinet Office and the fire department, identifying the time windows during which no manned helicopter was flying, and completely separating the airspace. As a result, it was about two hours after arrival at the site that the staff were actually able to fly the drones.
On this day, the staff first performed autonomous flight twice experimentally, confirmed that there were no problems with the drone or the recording equipment, and then carried out a third autonomous flight aimed at the full-scale investigation. Figure 3.6 shows the route actually flown autonomously at this time. The purpose of the second day's survey was to investigate the area, indicated in red, that the fire department had not yet investigated; the flight route indicated by the purple line had an altitude of about 100 m above the ground at a flight speed of about 54 km/h. The staff planned a
Fig. 3.6 Survey target area, BVLOS area and autonomous flight trajectory planning
Fig. 3.7 One screen of the movie taken by the flight recorder
round trip of about 6 km and a 12-min flight. Less than 1 km from the takeoff point, the drone disappeared behind the mountain and went beyond visual line of sight.
Wireless communication was also cut off, making it impossible to pilot the drone with the transmitter or to monitor it from the ground control station. (Normally, when transmitter communication is lost, a failsafe function returns the drone to the takeoff point, but this function was deliberately disabled for this investigation.) Flying on, the drone reached the area the fire department had not yet entered. A tense few minutes followed in which no drone information at all could be obtained after the drone went beyond visual line of sight and communication was cut off, but about 7 min after the start of the flight the drone returned safely at the scheduled time and completed the survey. Figures 3.7 and 3.8 show pictures taken by the drone at this time: Fig. 3.7 is a frame of the movie recorded by the flight recorder attached to the front of the drone, and Fig. 3.8 is part of a picture taken by the survey camera mounted below it. Furthermore, Fig. 3.9 is an ortho image generated with PhotoScan from the large number of photographs taken by the surveying camera; the data collected in this survey achieved 2 cm/pixel, which was confirmed to be sufficient for the needs of the site.
The staff thus surveyed the disaster site by fully autonomous, long-distance flight beyond visual line of sight, successfully shooting videos and pictures of places that people could not enter. The collected data were provided to the Fire and Disaster Management Agency of the Ministry of Internal Affairs and Communications, the
Fig. 3.9 Orthorectified images generated from the images captured by the four-eye camera
Cabinet Office, the Fukuoka prefectural government office, and others, and were used to plan temporary roads to the isolated villages. This was the first time the staff had used drones in the investigation of a real disaster site, and although a certain degree of success was achieved, many problems and points for improvement were also identified. Finally, several of them are listed below. The staff realized that for drones to operate fully as disaster response robots, not only technological evolution but also legal improvements and the establishment of an operation system are strongly necessary.
(1) First, it took time to transport the drone to Fukuoka Prefecture, and then to reach the site within the prefecture, because a road had been closed by the heavy rain. For this reason, the staff first arrived at the site in the afternoon two days after the disaster occurred, and it was impossible to conduct a survey on the day of the disaster occurrence
that needs information the most, or on the next day. In the future, it is necessary to position drones in advance in places where damage is expected and to prepare a system that allows them to be used immediately when necessary.
(2) Several manned helicopters were flying locally, and the drone could not fly freely. This time the staff were able to fly the drones by identifying the time windows in which no manned helicopter was flying and completely separating the airspace, but in the future it will be necessary to prepare a mechanism that allows manned helicopters and UAVs such as drones to coexist.
(3) Mobile phone and Internet networks were down in the vicinity of the site, so it was impossible to communicate with the outside and gather necessary information. In particular, it was a great loss that autonomous flight could not be performed on the first day because map information could not be obtained on site. It is important to anticipate the conditions of disaster sites and to complete the necessary preparations before going there.
(4) The autonomous flight path was created while checking the contour lines of the map. However, considering that the terrain may change greatly in a disaster and that power lines may not appear on the map, autonomous flight that depends only on GPS information may be inadequate. For a drone to fly autonomously and safely in places with many uncertain factors, such as a disaster site, the drone itself needs to be able to recognize its surroundings and, when necessary, avoid obstacles.
(5) Because wireless communication was interrupted partway through the long-distance flight beyond visual line of sight, no information on the drone could be obtained for most of the flight and it could not be operated. Although the drone returned safely this time, in actual operation it is desirable to monitor in real time the information on the drone in flight, the photographed images, and so on, and in some cases it may be necessary to command a pause or a return in the middle of the mission. In the future, it is necessary to establish wireless communication methods, such as satellites, that can be used even over long distances and in places with poor visibility.
Fig. 3.10 Flight levels of small unmanned aerial vehicles (drones) [8]
Fig. 3.11 Difference between AP and FC and guidance, navigation, and flight control [9]
flight beyond visual line of sight (BVLOS) without any observer in a less-populated area, and Level 4 is autonomous BVLOS flight in a populated area. It is anticipated that Level 3 will be achieved around 2018 and Level 4 in the 2020s.
The autopilot (AP) is an integrated hardware and software system for guidance (G), navigation (N), and control (C) by which the drone can carry out a range of flight, from a programmed flight such as a basic waypoint flight to an advanced autonomous flight, for example, flying while avoiding obstacles and re-planning its trajectory in real time by itself. Figure 3.11 presents the difference between the AP and the FC. The AP contains the FC; that is, the AP is the broader concept, which also encompasses the work of a skilled pilot of a manned aerial vehicle. In a manned aerial vehicle, a skilled pilot carries out obstacle recognition and decision making, in other words guidance, whereas an unmanned aerial vehicle is pilot-free and
hence that role must be played by the on-board computer and the ground support system. The flight controller (FC), on the other hand, is an integrated hardware and software system that carries out flight while keeping the unmanned aerial vehicle in a stable state along a given flight trajectory. In a commercially available hobby-use quadcopter, the AP is implemented with only the lower-order structure, where AP = NS + FC. In this case, the FC continuously calculates commands for the motor rotational speeds based on the pilot's input while keeping the airframe attitude stable.
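The layering described above, in which the AP wraps the FC, can be sketched as two nested loops: an outer guidance/navigation loop that converts position error into an attitude command, and an inner FC loop that stabilizes the attitude. The gains and the toy single-axis dynamics below are illustrative assumptions only, not ACSL's controller.

```python
# Minimal sketch of the AP-contains-FC layering, for a single pitch axis.

def fc_attitude_control(att, rate, att_cmd, kp=8.0, kd=3.0):
    """Flight controller (FC): stabilize attitude toward a commanded
    angle; output is a normalized torque command."""
    return kp * (att_cmd - att) - kd * rate

def ap_waypoint_guidance(pos, waypoint, k=0.05, max_tilt=0.3):
    """Autopilot (AP) outer loop: convert along-track position error
    into an attitude command, which the FC then tracks."""
    cmd = k * (waypoint - pos)
    return max(-max_tilt, min(max_tilt, cmd))

# Tiny simulation: double-integrator attitude, kinematic translation.
att, rate, pos, vel = 0.0, 0.0, 0.0, 0.0
dt, waypoint = 0.01, 50.0
for _ in range(20000):
    att_cmd = ap_waypoint_guidance(pos, waypoint)
    torque = fc_attitude_control(att, rate, att_cmd)
    rate += torque * dt
    att += rate * dt
    vel = 10.0 * att          # tilt produces forward speed (toy model)
    pos += vel * dt
print(f"final position: {pos:.1f} m")  # converges toward the 50 m waypoint
```

Removing the outer loop and feeding `att_cmd` directly from pilot sticks reduces this structure to the hobby-quadcopter case AP = NS + FC described above.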
The degrees of drone autonomy by which Level 3 and Level 4 of Fig. 3.10 are achieved are given in Fig. 3.12. Figure 3.12 presents the autonomy of drones, which is almost a synonym for safety, classified into five stages from Class A to Class E, with the concept, guidance level, navigation level, control level, and a scenario assuming logistics detailed for each stage. Class E is the level at which the operating skill of a human radio-control pilot is put to the test. Class D is a class in which autonomous flight is possible as a so-called waypoint flight, i.e., a programmed flight in which everything from take-off to landing is determined in a trajectory plan made in advance by a human, on the assumption that GPS radio waves can be received. Guidance is judged entirely by a skilled person. It is a class in which everything is processed by the on-board CPU, which automatically reports communication failures, compass abnormalities, the remaining battery level, and so on. Most industrial drones commercially available now can be judged as Class D. Class C is for drones capable of
Fig. 3.12 Autonomy (safety) class of drone and future roadmap (Class A–E) [7]
autonomous flight even in a non-GPS environment. Various methods are used, such as image processing using cameras, laser, lidar, total stations, sound, and radio waves. The various drone abnormality notifications are similar to those of Class D, and whether to continue the mission is judged by a human. The world's state-of-the-art drones as of 2017 are thought to be in Class C, though close to Class B. Class B is for advanced drones, flying robots, expected to appear around 2019. They are defined as drones (flying robots) that never crash: if an abnormality occurs, they autonomously deploy a parachute or the like and make a controlled crash landing before crashing. To do this, an abnormality diagnosis algorithm runs constantly during flight and, if the health condition of the flying robot differs from the normal condition, the cause of the abnormality is identified and whether the mission is continued is judged autonomously. Guidance thus basically depends on the autonomy of the drone (flying robot) itself. SAA (Sense and Avoid) is also realized in Class B; SAA covers the discovery and immediate avoidance of an obstacle ahead in flight and trajectory re-planning in real time. Class A is the ideal form of flying robot, which can be called biologically inspired flight (taking principles from nature but implementing them in a way that goes beyond what is found in nature), i.e., flying like a bird. GPS radio waves are no longer necessary. The robot carries out high-speed image processing of images taken by its on-board camera or the like, and thus estimates its own position; the flying robot itself recognizes where it is currently flying. It can reach a destination even 10 km away or farther using landmarks on the ground, without GPS radio waves. It is a class in which the flying robot may of course receive support from a UTM (UAV Traffic Management system) where necessary, and is capable of safe flight, perceiving flying-robot abnormalities in advance while carrying out FTA (Fault Tree Analysis), a fault analysis during flight. At this stage, the learning effect of artificial intelligence (AI) can also be utilized: the more the flying robot flies, the more intelligent its autonomous flight becomes. This is expected to be realized in the 2020s.
Figure 3.13 presents the correlation between the flight levels set by the government of Japan, as in Fig. 3.10, and the evolution levels of drones, as in Fig. 3.12. The figure gives a concept of what degree of drone autonomy (safety) permits what flight level. The long-distance BVLOS flight without observers of Level 3 requires the middle of Class C or higher, and a certain capability for abnormality diagnosis of the drone is desirable. If the drone has evolved to Class B, the SAA function is implemented, abnormality diagnosis can respond almost autonomously, and the flying robot has the function of autonomously detecting an abnormality and, depending on the diagnosis, activating a safety device to prevent a crash. It can therefore be judged a level that presents no problem for autonomous flight in less-populated areas. The autonomous flight in populated areas of Level 4 requires Class A. In particular, a Class A robot can immediately recognize changes in the three-dimensional environment, such as weather, radio waves, and magnetic fields, and is fully provided with guidance abilities such as FTA analysis and crisis management capacity.
Fig. 3.13 Correlation diagram between flight level of drone and autonomy (safety) [7]
In that sense, it is an unmanned aerial vehicle close to a manned aerial vehicle in terms of safety design, which is expected to significantly reduce the probability of accidents in populated areas.
Faults of a multi-rotor helicopter during autonomous flight can be roughly divided into four types: the first is communication system faults related to the uplink and downlink between the ground and the drone; the second is sensor system faults related to navigation, such as the IMU sensors, barometer, GPS receiver, INS-related devices, and vision; the third is control system faults, mainly in the microcomputer board that carries out the control calculations and its peripheral devices; and the fourth is propulsion system faults, mainly in the drive system. These faults can generally be handled by employing a redundant system. However, when redundancy is impossible due to various restrictions, the fault tolerant control presented below is effective. In particular, it is generally difficult to employ a redundant propulsion system from the viewpoint of size, weight, and cost.
ACSL therefore introduces an autonomous control technology that targets fault tolerant control of the propulsion system of the multi-rotor helicopter. The propulsion system of a multi-rotor helicopter is made up of propellers, motors, motor drivers, and a battery. All damage to and faults of these components are propulsion
Fig. 3.14 Fault tolerant control system (fault tolerant control) [7]
system faults. Figure 3.14 presents a fault tolerant control system against propulsion system faults. Its basic idea is as follows: the computer holds a physical model that simulates the actual system, as in Fig. 3.14; a control input is fed to both the actual system and the physical model; outputs are obtained from both; and the difference between them is computed. If the difference is within a permissive range, no abnormality is present. If it exceeds the permissive range, an abnormality is judged to be present, and an inverse problem, the fault analysis called FTA (Fault Tree Analysis), is solved to determine what the abnormality is. In Fig. 3.14, the part above the dashed line denotes software implemented in the supervisor, where the abnormality diagnosis algorithm carries out the FTA analysis at the fastest possible speed using the physical model. If a fault occurs, the fault information is transmitted to a re-building section, the control structure is switched to the optimal controller and control parameters, and the flight is continued. Even though the control structure is switched momentarily, the switch still takes a finite time. In the case of a hexa-rotor drive system fault, the faulty ESC, drive motor, and propeller system is stopped, and the control structure is changed momentarily. Controller robustness is also important so that the multicopter behavior does not fluctuate greatly in the period from when the control structure is changed to when control by the five remaining motors starts. For this reason, sliding mode control, a non-linear control capable of robust performance, is applied, so that one motor is momentarily stopped, the control structure is changed, and the drone attitude is stabilized.
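The model-comparison scheme of Fig. 3.14 can be sketched as a residual check: the same control input drives both the real system (here simulated, with a partial actuator fault injected) and the supervisor's physical model, and the output difference is tested against a permissive range before handing over to FTA and controller re-building. The first-order dynamics, fault size, and threshold below are illustrative assumptions, not ACSL's parameters.

```python
def physical_model(state, u, dt):
    """Nominal first-order plant model held by the supervisor."""
    return state + dt * (-2.0 * state + u)

def real_system(state, u, dt, effectiveness=1.0):
    """Actual plant; `effectiveness` < 1 emulates a partial motor fault."""
    return state + dt * (-2.0 * state + effectiveness * u)

def detect_fault(threshold=0.05, fault_at=200, dt=0.01, steps=400):
    x_real, x_model = 0.0, 0.0
    for k in range(steps):
        u = 1.0                              # common control input
        eff = 0.5 if k >= fault_at else 1.0  # inject 50% thrust loss
        x_real = real_system(x_real, u, dt, eff)
        x_model = physical_model(x_model, u, dt)
        residual = abs(x_real - x_model)
        if residual > threshold:  # outside the permissive range -> fault
            return k              # hand over to FTA / controller re-build
    return None

print("fault flagged at step:", detect_fault())  # shortly after step 200
```

The residual stays at zero while the plant is healthy and grows once the fault is injected, so detection latency is set by the threshold and the plant time constant.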
Next, ACSL discusses methods to constantly optimize the controller by adapting to environmental changes during flight. One such method is self-tuning control. Assuming a home delivery drone, after delivering a parcel the drone becomes lighter, which causes the center of gravity to move and the principal axis of inertia to fluctuate. In particular, if no measure is taken, the drone will fly in a state where the controller deviates from the optimal state, such as a sharp rise of the drone,
As described with reference to Figs. 3.11 and 3.12, guidance (G), navigation (N), and the control system (C) play an important role in carrying out autonomous flight while quickly recognizing obstacles in front of the drone, autonomously avoiding crashes, recognizing a complicated environment, and self-generating the flight route. These three elements (GNC) are the core technology of fully autonomous flight and will evolve rapidly in the future as the brain of autonomous flight. In particular, as seen in logistics and home delivery drones, when autonomous flight advances to beyond visual line of sight (BVLOS) and long-distance flight, the guidance, navigation, and control systems will determine the performance in a crucial manner.
The guidance system in a UAV plays a role similar to the human cerebrum, i.e., it is in charge of recognition, intelligence, and decision, as presented in Fig. 3.17. It carries out so-called real-time route generation, i.e., autonomous flight that determines a target trajectory in real time while detecting obstacles and avoiding crashes even in a complicated unknown environment. If an abnormality occurs in the drone as a flying robot, it determines whether flight can be continued and, if that is difficult, the flying robot returns to the ground while searching for a safe place. Since such missions are included, this corresponds to high-level autonomous flight that requires advanced, momentary decisions. In a manned
aerial vehicle, this is an advanced skill carried out by the pilot. In a UAV, however, the computer needs to do everything: recognition of a three-dimensional space that changes continuously, and momentary determination of the flight route and altitude. At present, flight is carried out with no or little guidance. In this sense, the most important and urgent technological issue for drones is to implement a guidance function. The guidance function has two parts: one to be implemented by the drone itself and one to be carried out by a ground support system such as the UTM. Most of the functions of guidance, i.e., recognition, intelligence, and decision, are expected to be realized by applying AI in the future. Then a total system will be achieved in which flying robots are networked and connected to the ground support system, and manned and unmanned aerial vehicles recognize each other when flying.
Most current UAVs capable of autonomous flight realize a basic autonomous flight, the waypoint flight, with only two of the systems, navigation and control, and no guidance system. Where guidance corresponds to the human cerebrum, navigation and control correspond to the human cerebellum, which controls the sense of equilibrium and motor function. An advanced navigation system redundantly includes lasers, ultrasonic sensors, infrared sensors, single and stereo cameras, 3D cameras, vision chips, and so on; carries out mapping and obstacle detection; and improves the accuracy of localization, i.e., self-position estimation.
Regarding the drone autonomy (safety) presented in Fig. 3.12, one way to realize the bioinspired flight of Class A is thought to be the structure in Fig. 3.18, a chart that details Fig. 3.11 and presents the contents of the three elements of guidance (G), navigation (N), and control (C). The supervisor corresponds to G: it determines whether to use GPS/INS navigation or visual SLAM navigation, changes the structure of the control system as necessary while carrying out exact environment recognition and momentary decisions about every event encountered during flight, and carries out the mission while generating a target trajectory in real time. The fault tree analysis (FTA) estimates what is going on from the difference between the real-time identification model and the ideal model; the abnormality diagnosis of the flying robot is thus carried out by obtaining the difference between the flight model identified during flight and the ideal model and determining whether the difference falls within the permissive range. All of these are the role of the supervisor, i.e., the guidance G. Regarding crisis management, encounters with crises are learned by AI in advance, and whether to carry out the mission is decided by matching against the degree of danger. Troubles during flight include various events, and how to send an alert signal at the time of these abnormalities has been learned in advance by sufficient AI training. Unless there is a special abnormality, the flying robot reaches its destination while accurately recognizing the three-dimensional environment using the navigation vision sensors. The flying robot may encounter sudden weather changes and gusts during flight, but each time, with the control system structure being variable, it flies to the destination with top priority given to efficiency in normal times and to absolute stability in times of unexpected disturbance.
3.1.7 Conclusions
As stated in Sect. 3.1.3, disaster search and rescue flying robots have almost reached the practical stage, and drones for information gathering at the time of a disaster have already been deployed to disaster-related research institutions, local fire departments, and police. In the future, by flying every time a disaster occurs, various successful and failed cases will be accumulated, and it is thought that they will form a database from which disaster search and rescue drones of even higher performance and functionality will evolve. In particular, if flying robots attain Class A autonomy, formation or swarm flight becomes easy, and given only a destination they can realize highly efficient search and rescue missions, taking the shortest course while avoiding collisions.
The Class A drones of Fig. 3.12, which will become the next generation of industrial drones and might be called real flying robots, are required to have the reliability, durability, and safety to repeat a daily flight of about eight hours for about five years, not to mention an obvious procedure of daily inspection and component replacement. The flying robot of the near future is an advanced, intelligent flying machine, but it still remains a machine. The flying robot, which flies at very low altitude where the weather tends to change drastically, must have the ability to respond to abnormal events during flight. The next-generation
industrial-use flying robot has to be a “drone that never crashes” that carries out a crash landing at a safe place found by the flying robot itself before crashing. In that sense, such an industrial-use flying robot has not yet appeared in the world. However, flying robots are evolving so fast that in the 2020s they will have Class A autonomy.
3.2.1 Introduction
Recently, many disasters such as earthquakes and floods have occurred all over the world. In such situations, it is said that the survival rate drops precipitously 72 h after a disaster occurs, the so-called “golden 72 h,” so prompt search and rescue is required. Remote sensing technology using unmanned aerial vehicles (UAVs) is promising for search and rescue in disaster-stricken areas because a UAV can perform the task even when traffic is cut off. For such remote sensing, vision-based devices are commonly used. However, visual sensing has difficulties when people are occluded and/or lighting conditions are poor. Acoustic sensing can compensate for these limitations. As an acoustic sensor for UAVs, the acoustic vector sensor (AVS) is used in the military field [11, 12]. However, it is used to detect high-power sound sources such as airplanes and tanks, and it is not designed to find people in distress, that is, low-power sound sources. On the other hand, microphone array processing has been studied in the field of robot audition and has also been applied to UAVs. Basiri et al. reported sound source localization of an emergency signal from a safety whistle using four microphones [13]. They targeted only the high-power safety whistle with a microphone array installed on a glider; thus, more general sound sources, including human voices, should be detected in a highly noisy environment. Okutani et al. [14] reported one of the earliest studies on more general sound detection: they installed an 8-ch microphone array on a Parrot AR.Drone and applied multiple signal classification (MUSIC) [15] as the sound source localization algorithm. Ohata et al. extended their work to support dynamic noise estimation [16], and Furukawa et al. made another extension that considers the UAV's motion status for more precise noise estimation [17]. Hoshiba et al. developed a sound source localization system based on MUSIC and gave an outdoor demonstration [18]. In their system, a water-resistant microphone array was adopted so that the system can work in all weather conditions, and a data compression method was applied to reduce the amount of communicated data. However, the reported studies have not yet considered the following issues.
1. Most studies ignored the effect of the inclination of a UAV during flight because evaluation was conducted in a simulated manner.
2. Most studies focused only on direction-of-arrival (DoA) estimation as sound source localization and did not deal with 3D position estimation.
To solve these problems, a novel sound source mapping system is developed. For
the first issue, a new 16-channel microphone array which considers inclination is
designed. For the second one, a new 3D position estimation method with Kalman-
filter-based sound source tracking is proposed by integrating DoA estimation, infor-
mation obtained from GPS/IMU, and a terrain model. The proposed method can run
on a single-board computer (SBC), which drastically reduces the amount of
communication data between the UAV and the ground station, because only lightweight
sound position data are communicated, without sending any raw sound data.
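The Kalman-filter-based tracking step mentioned above can be illustrated with a minimal sketch: a static-position filter that fuses successive ground-projected DoA observations. The class name, the identity dynamics, and the noise covariances below are illustrative assumptions, not the authors' design:

```python
import numpy as np

class SoundSourceTracker:
    """Minimal Kalman filter for a (nearly) stationary 2D sound-source position."""

    def __init__(self, q=0.01, r=4.0):
        self.x = None                 # (2,) estimated ground position
        self.P = None                 # (2, 2) estimate covariance
        self.Q = q * np.eye(2)        # process noise (source nearly static)
        self.R = r * np.eye(2)        # observation noise of one DoA projection

    def update(self, z):
        """Fuse one ground-projected observation z = (x, y) and return the estimate."""
        z = np.asarray(z, dtype=float)
        if self.x is None:            # initialize on the first observation
            self.x, self.P = z.copy(), self.R.copy()
            return self.x
        self.P = self.P + self.Q                      # predict (identity dynamics)
        K = self.P @ np.linalg.inv(self.P + self.R)   # Kalman gain
        self.x = self.x + K @ (z - self.x)            # correct with the observation
        self.P = (np.eye(2) - K) @ self.P
        return self.x
```

Repeated updates average out the per-observation projection noise, which is why only the filtered position, not the raw audio, needs to be transmitted.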
3.2.2 Methods
This section describes two main components of the proposed sound source mapping
system: the design of a new 16-ch microphone array and the sound source mapping
algorithm.
The shape and location of the microphone array were considered first. Although it
is theoretically known that the diameter of a microphone array should be large to
achieve high resolution, a spherical microphone array [18] was adopted, which was
assembled to the end of an arm of a UAV shown in Fig. 3.19. In this case, propeller
noise is regarded as directional noise, and a simple MUSIC algorithm can be used to
deal with such noise because the noise can be canceled by ignoring sound sources
originating from propeller directions.
After that, the layout of microphones on the sphere was considered. The target
sound sources are basically located on the ground, and it seems that there is no
problem when microphones are distributed only on the lower hemisphere. However,
a UAV’s inclination should be considered when it is flying. As shown in Fig. 3.20, a
microphone array inclines according to a UAV’s inclination. When microphones are
only on the lower hemisphere, the performance of sound source localization for the
direction of the inclination will degrade. Using a gimbal to keep the microphone array
horizontal would be an easy way to solve this problem, but the payload of a UAV is
limited, and a heavier UAV has a shorter flight time. Finally, a 16-channel microphone
array (MA16) was designed by installing twelve microphones on the lower hemisphere
and another four on the upper hemisphere, as shown in
Fig. 3.21. In this microphone array, sixteen omni-directional MEMS (Micro Electro
Mechanical Systems) microphones are installed on a spherical body with a diameter
of 110 mm. The coordinates of microphone positions are shown in Fig. 3.22. By
installing microphones on the upper hemisphere, it is possible to perform sound
source localization when a UAV inclines; theoretically, inclinations of up to 45°
can be handled.
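For illustration, a hypothetical 12 + 4 layout on the 110 mm sphere can be generated as follows. The ring elevations and angular offsets are invented for this sketch; the actual MA16 coordinates are those given in Fig. 3.22:

```python
import numpy as np

def ma16_layout(diameter=0.110):
    """Hypothetical 16-mic layout: twelve microphones on the lower hemisphere
    (two rings of six) and four on the upper hemisphere (one ring)."""
    r = diameter / 2.0
    mics = []
    # (elevation from the horizontal plane [deg], microphones, azimuth offset [deg])
    for elev, n, offset in [(-60.0, 6, 0.0),    # lower ring
                            (-20.0, 6, 30.0),   # lower-middle ring, staggered
                            (+40.0, 4, 0.0)]:   # upper ring
        el = np.deg2rad(elev)
        for k in range(n):
            az = np.deg2rad(offset + 360.0 * k / n)
            mics.append([r * np.cos(el) * np.cos(az),
                         r * np.cos(el) * np.sin(az),
                         r * np.sin(el)])
    return np.array(mics)               # shape (16, 3), all at radius 55 mm
```

The four upper-hemisphere microphones are what preserve localization performance when the array tilts with the UAV.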
The weight of MA16 is 150 g including an SBC installed inside MA16 to execute
sound source localization. As the SBC, RASP-MX (Fig. 3.23, System In Frontier Inc.),
developed for sound source localization and separation, was selected. The
specification of RASP-MX is shown in Table 3.2. Because RASP-MX is about the
size of a business card and lightweight, it is suitable for a UAV whose payload is
limited.
3 Recent R&D Technologies and Future Prospective of Flying Robot … 103
All microphones of MA16 are connected to RASP-MX in a cascade way, and
recorded acoustic signals are processed. Results of processing are output through
a USB cable using UART (Universal Asynchronous Receiver/Transmitter) protocol
to ensure versatility. For sound source localization in the previously-reported sys-
tem [18], open source software HARK (Honda Research Institute Japan Audition
for Robots with Kyoto University)1 [19] was used on a PC at the ground station,
and thus, the recorded acoustic signals had to be sent to the station. In the proposed
system, only sound source localization results, without any raw audio signals, are
sent to the ground station, and the amount of data communication is drastically
reduced. Note that Embedded-HARK is adopted for RASP-MX instead of
using the PC version of HARK. In Embedded-HARK, to reduce computational cost,
mainly four modifications were made as follows: removal of middleware, use of only
single-precision floating operations, optimization for ARM NEON2 (advanced single
instruction multiple data architecture extension), and fast calculation of trigonometric
functions using a look-up table.
MUSIC-based sound source localization can provide 2D directional information, that
is, azimuth and elevation of a sound source, and the information can be utilized to
localize the position of a sound source.
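As background, the narrowband MUSIC computation used for DoA estimation can be sketched in a few lines of NumPy. This is a free-field illustration, not the HARK implementation; the plane-wave steering-vector model and all parameters are simplifying assumptions:

```python
import numpy as np

def music_spectrum(X, mic_pos, freq, directions, n_src=1, c=343.0):
    """Narrowband MUSIC pseudo-spectrum over candidate arrival directions.

    X          : (n_mics, n_frames) complex STFT snapshots at one frequency bin
    mic_pos    : (n_mics, 3) microphone coordinates [m]
    freq       : analysis frequency [Hz]
    directions : (n_dirs, 3) unit vectors of candidate arrival directions
    n_src      : assumed number of sound sources
    """
    R = X @ X.conj().T / X.shape[1]                  # spatial correlation matrix
    _, V = np.linalg.eigh(R)                         # eigenvectors, ascending order
    En = V[:, : X.shape[0] - n_src]                  # noise subspace
    k = 2.0 * np.pi * freq / c                       # wavenumber
    # free-field plane-wave steering vectors for every candidate direction
    A = np.exp(1j * k * (mic_pos @ directions.T))    # (n_mics, n_dirs)
    num = np.sum(np.abs(A) ** 2, axis=0)             # a^H a per direction
    den = np.sum(np.abs(En.conj().T @ A) ** 2, axis=0)
    return num / den                                 # peaks at source directions
```

In practice the spectrum is averaged over frequency bins, and, as described above, directions originating from the propellers are excluded so that the rotor noise is treated as directional noise.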
It is natural to assume that the target sound source to be detected by the UAV with
the microphone array is located on the ground. The detection range of the microphone
array is limited, so the observation can be modeled locally around the UAV. This fact
allows one to assume that the terrain around the UAV can be modeled by a simple
model, e.g. a flat plane.
Under the above assumptions, the position of the sound source on the ground can
be computed by fusing the estimated sound source direction from the UAV with the
position and the pose of the UAV. Figure 3.24 depicts the observation model of the
source on the ground. The model assumes that the terrain can be approximated as a
completely flat plane, and that the UAV hovers stably in one spot.
Given the UAV position and its pose, with the information of the sound source
direction, the estimated sound source position, denoted by x_s = (x_s, y_s)^T, can be
modeled as
1 https://www.hark.jp/.
2 https://developer.arm.com/technologies/neon.
x_s = x_h + z_h tan(φ) u + ε, (3.1)

x_s ≈ x_h + z̄_h tan(φ̄) ū + w, (3.2)
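Equation (3.1) can be implemented directly. In this sketch, φ is assumed to be measured from the downward vertical, and the world-frame azimuth is assumed to combine the UAV heading ψ with the source azimuth in the UAV frame; both conventions are read off the notation above rather than stated explicitly in the text:

```python
import numpy as np

def sound_source_position(x_h, z_h, phi, psi, theta_az):
    """Project a DoA observation onto a flat ground plane (Eq. 3.1, noise-free).

    x_h      : (2,) UAV horizontal position [m]
    z_h      : UAV altitude above the ground plane [m]
    phi      : angle of the arrival ray from the downward vertical [rad]
    psi      : UAV heading [rad]
    theta_az : source azimuth in the UAV frame [rad]
    """
    ang = psi + theta_az                       # azimuth in the world frame
    u = np.array([np.cos(ang), np.sin(ang)])   # horizontal unit vector toward source
    return x_h + z_h * np.tan(phi) * u
```

For example, a UAV hovering at 10 m that hears a source 45° off the vertical, straight ahead, places it 10 m in front of its own ground position.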
3.2.3 Evaluation
3.2.3.1 Experiments
3.2.3.2 Results
First, results of sound source localization are described. Figure 3.27 shows the
MUSIC spectra at an SNR of −15 dB. In each MUSIC spectrum, the relative power
of the sound in each direction is shown on a color map. The target sound is located
around θ = −45°, φ = −30° (Fig. 3.27a) and θ = −30°, φ = 20° (Fig. 3.27b); (i), (ii),
and (iii) correspond to Table 3.4. In every MUSIC spectrum, the noise power of the UAV can be
seen around the azimuth angle of 180◦ . In Fig. 3.27a(i) and a(ii), the target sound
source power can also be seen. As these two figures show, sound source localiza-
tion could be performed using both MA12 and MA16 when the target sound is
located within the negative range of the elevation angle. As shown in Fig. 3.27a(iii),
Fig. 3.27 MUSIC spectra. a θ = −45◦ , φ = −30◦ , b θ = −30◦ , φ = 20◦ . (i) MA12 + HARK,
(ii) MA16 + HARK, (iii) MA16 + Embedded-HARK
even when Embedded-HARK was used, a similar peak corresponding to the sound
source was observed. On the other hand, in the case where the target sound is located
within the positive range of the elevation angle, the peak could not be observed when
MA12 was used, as shown in Fig. 3.27b(i). Using MA16, the peak was successfully
observed thanks to the four additional microphones, as shown in Fig. 3.27b(ii) and b(iii).
Our results confirmed the effectiveness of MA16 and Embedded-HARK for sound
source localization on a UAV.
3.2.3.3 Discussions
The angle limitation actually has another effect: it can reduce the number of transfer
functions, so real-time processing can be ensured.
Using the developed system, a demonstration in an actual outdoor environment
was performed (Fig. 3.29). In the demonstration, a task of detecting an occluded person
was performed in a field simulating a disaster-stricken area. Figure 3.29a shows an
overview of the area. Figure 3.29b illustrates a 3D map, in which a white object and
a blue circle represent the drone and a sound source, respectively. Figure 3.29c shows
person B in a pipe, who corresponds to a person in distress. Figure 3.29d depicts the
results of DoA estimation; the angle and radius indicate the azimuth and elevation of
the MUSIC spectrum in the UAV's coordinates. A white circle shows that the sound
source for person B is localized. Figure 3.29e exhibits the 3D sound
position estimation results in a top view map. Red circles are sound source position
candidates, and blue circles are the finally estimated sound source positions, which
correspond to the blue circles shown in Fig. 3.29b. The black line represents the trajectory
of the UAV. Even in an actual outdoor environment, an occluded person could be
localized with high accuracy in real time. As our results show, by using MA16,
Embedded-HARK, and the angle limitation, the system provided highly accurate
localization in real time, confirming that the developed system is useful for search
tasks in disaster-stricken areas. In addition, guidelines on parameters such as
microphone positions were obtained for designing a sound source localization
system for a UAV according to the situation.
3.2.4 Conclusions
3.3.1 Introduction
To improve the robustness and reliability of flying robots in various applications,
the limits of flight performance under severe weather conditions such as strong
winds and gusts must be extended and verified. Three means have been proposed
to improve the control responses of multiple rotor drones: the first is to replace
conventional rotational speed control with flight control based on variable rotor
pitch angles; the second is to use two motors at the center of the aircraft to drive all
rotors, thus reducing the attitude-changing inertia of the whole aircraft; the third is
to design an optimal duct for each rotor, thus improving the rotor efficiency. Besides
the above-mentioned three techniques, the flight performance of the multiple rotors
near walls is studied through numerical simulations using advanced computational
fluid dynamics (CFD) techniques. For the flight demonstrations of these unique
techniques, the flight dynamics modelling of the multiple rotor drones and the
estimation of the distance to a wall are discussed, along with the implementation of
advanced autonomous flight systems in the prototypes. Variable pitch control is a
common feature of conventional helicopter control, where the main rotor pitch angle
changes cyclically with the rotor rotation through a swashplate, while the tail
rotor is controlled with the collective pitch angle only. Several variable
pitch-controlled multiple-rotor drone prototypes already exist, but in this research an
integrated and compact variable pitch rotor module is proposed, together with
different sources of rotor drives.
Almost all multiple rotor drones are controlled by changing the rotation speed of
their rotors. The "MR. ImP-1 (Multiple Rotor drone ImPACT-1)", developed together
with Taya Engineering Corp., employs a variable pitch angle mechanism for
controlling its thrust to achieve agile flight control (Fig. 3.30). Coincidentally, a group
of manufacturing companies also launched a joint development of a variable pitch
controlled multiple rotor drone, and public announcements of their project and
MR. ImP-1 were made almost at the same time, in April 2016. The MR. ImP-1's rotor is driven by
a brushless DC motor. A servomotor below the main motor is connected to the root
of the blades. The collective pitch can be varied by the servomotor while the rotor
rotates. As the moment of inertia around the blade pitch axis is much smaller
than that around the revolution axis of the rotor, the servo motor completes the
change in the pitch angle of the blades in a very short time. Hence, the flight agility of
Fig. 3.30 Photo of MR. ImP-1 and schematic sketch of the variable pitch control rotor [28]
MR. ImP-1 was expected to be higher than that of traditional multiple rotor drones.
Moreover, quick changes in thrust enable MR. ImP-1 to fly stably even under
severely disturbed wind conditions. To achieve high agility, the shape of MR. ImP-1's
rotor blade was also improved. The blade has a symmetrical airfoil shape called
NACA 0009 and no twist along the span. The blades can generate downward and
upward thrusts symmetrically by changing the sign of the collective pitch angle.
Thanks to the modified blade shape, MR. ImP-1 can control its attitude by using
a large control moment and even deal with the unstable descending flight states
called the "windmill brake state", "vortex ring state", and "turbulent wake state" by
impulsively changing the descent rate. The parameters of the rotor and blade are detailed in
Sect. 3.3.6. A prototype of the variable pitch control rotor was experimentally tested.
In the experiment, the thrust was changed between 5 and 10 N in two ways: first by
changing the revolution speed with a fixed collective pitch, and second by changing
the collective pitch with a fixed revolution speed. The control signals of the
collective pitch and the rotor revolution speed were changed impulsively. In Fig. 3.31,
time histories of thrust measurements are presented whilst increasing and decreasing
the thrust. The results demonstrated that the duration of the thrust change is less than
0.03 s during both the thrust increase and decrease for the case of the collective pitch
control. For the rotor speed control, in contrast, the durations are approximately 0.2 s
during thrust increase and 0.4 s during thrust decrease. The undulation of the curve
for the collective pitch change when t > 0.5 s is observed in the figure; however, it
does not affect the conclusion, because it is presumably caused by vibration of the
structure of the experimental apparatus. The experiment demonstrated the
improvement in responsivity achieved by the collective pitch control technique.
For more agility, the “MR. ImP-2 (Multiple Rotor drone ImPACT-2) ” shown in
Fig. 3.32 was developed in cooperation with Taya Engineering Corp. Unlike the
MR. ImP-1 that has six motors at the tips of the support arms to drive each rotor, the
Fig. 3.31 Time history of thrust during controlling thrust values between 5 and 10 N [28]
MR. ImP-2 drives six rotors using only two motors that are placed near the center of
the fuselage. Three rotors at every other position are connected to the same motor via
belts and rotate synchronously. To cancel out the torque around the z-axis, the three
rotors rotate counterclockwise and the others rotate clockwise (Fig. 3.33). In addition,
the six rotors are inverted to reduce the air drag acting on the support arms. As the
reduction of motors from six to two makes rotors rotate synchronously, the flight
control of MR. ImP-2 is achieved by employing the variable pitch control technique
developed in MR. ImP-1. The most promising aspects of the improvements are as
follows. The first advantage is that the maneuvering technique of MR. ImP-2 can
generate the control moments without changing the consumed power. For example,
to rotate the fuselage around the roll (X-) axis, the three rotors in the Y < 0 region
increase their thrusts and the three in the Y > 0 region decrease theirs. As long as
the sum of the changes in the torques generated by the three rotors driven by the
same motor is zero, the maneuver generates the rolling moment with constant power
consumption. In
addition, the yawing moment is cancelled. The second advantage is the reduction in
the moments of inertia. As the mass of each rotor unit is lower than in MR. ImP-1, the
Table 3.5 Comparison of total mass and moments of inertia between MR. ImP-1 and MR. ImP-2

                                           MR. ImP-1   MR. ImP-2
Mass without batteries (kg)                3.5         3.2
Moment of inertia around x-axis (kg m²)    0.20        0.095
Moment of inertia around y-axis (kg m²)    0.20        0.089
Moment of inertia around z-axis (kg m²)    0.37        0.16
moments of inertia around the three axes of MR. ImP-2 are also smaller than those
of MR. ImP-1 as summarized in Table 3.5. It also improves the angular acceleration
and the agility of MR. ImP-2.
ẋ = Ax + Bu_at (3.3)

Here, x = (a v v_gps p)^T is the state vector, a is the acceleration, v is the velocity,
v_gps is the velocity considering the GPS delay, and p is the position. Also, u_at
represents the attitude command value.
The reference model is given as

ẋ_r = A_r x_r + B_r r (3.4)

Here, x_r ∈ R^4 represents the state of the reference model, and r ∈ R represents the
reference value.
RMFMPC is designed by using the derived mathematical model and the reference
model. First, the error state is defined as the following equation:

e ≡ x − x_r (3.5)
By differentiating (3.5) with respect to time and substituting (3.3) and (3.4), the next
equation is obtained.
ė = ẋ − ẋ_r = A_r e − (A_r − A)x − B_r r + Bu_at (3.6)
Here, using the matrices K_1 and K_2 that satisfy the model matching conditions A_r −
A = B K_1 and B_r = B K_2, the control input of the model following control is given as

u_at = K_1 x + K_2 r + u_b (3.7)

ė = A_r e − (A_r − A)x − B_r r + Bu_at = A_r e + Bu_b (3.8)
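The model matching conditions can be solved numerically for K_1 and K_2; a least-squares sketch with NumPy (the concrete matrices depend on the identified model and are not given here):

```python
import numpy as np

def model_matching_gains(A, B, Ar, Br):
    """Solve Ar - A = B K1 and Br = B K2 for K1, K2 in the least-squares sense.

    If the model matching condition holds exactly, the residuals are zero and
    the returned gains satisfy the condition exactly.
    """
    K1, *_ = np.linalg.lstsq(B, Ar - A, rcond=None)
    K2, *_ = np.linalg.lstsq(B, Br, rcond=None)
    return K1, K2
```

When A_r and B_r are constructed from A and B (as the model matching condition requires), the recovered gains reproduce them exactly.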
Next, the control input u_b which stabilizes the error system (3.8) is designed
based on model predictive control. Now, the evaluation function up
to the finite time T seconds is defined as follows:
J = Φ(ē(t + T)) + ∫_t^{t+T} L(ē(τ), ū_b(τ)) dτ (3.9)

Φ(ē(t + T)) = ē^T(t + T) S ē(t + T) (3.10)

L(ē(τ), ū_b(τ)) = ē^T(τ) Q ē(τ) + R ‖ū_b(τ)‖² (3.11)
Here, ē and ū b represent predicted values of error states and inputs within the pre-
diction interval. The first and the second terms on the right side are the terminal
cost and the stage cost, respectively, and S, Q, and R represent the weighting matri-
ces. From the above, the model predictive control reduces to finding the optimal
input sequence that minimizes the evaluation function (3.9) under the dynamics (3.8). The
optimization problem is shown as follows:
Minimi ze : J
˙ ) = Ar ē(τ ) + B ū b (τ )
ē(τ
Subject to : (3.12)
ē(τ ) |τ =0 = e(t)
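For the unconstrained case, minimizing the discretized cost reduces to a linear least-squares problem; a condensed sketch (the Euler discretization and the scalar input weight are simplifying assumptions, not the chapter's implementation):

```python
import numpy as np

def mpc_input_sequence(Ar, B, e0, S, Q, R, N, dt):
    """Optimal input sequence for the Euler-discretized error system, minimizing
    e_N^T S e_N + sum_k (e_k^T Q e_k + R u_k^2) dt, without input constraints."""
    n, m = B.shape
    Ad = np.eye(n) + Ar * dt                     # Euler discretization
    Bd = B * dt
    # batch prediction: stack of e_0..e_N as F e0 + G [u_0 .. u_{N-1}]
    F = np.zeros(((N + 1) * n, n))
    G = np.zeros(((N + 1) * n, N * m))
    F[:n] = np.eye(n)
    for k in range(1, N + 1):
        F[k * n:(k + 1) * n] = Ad @ F[(k - 1) * n:k * n]
        for j in range(k):
            G[k * n:(k + 1) * n, j * m:(j + 1) * m] = \
                np.linalg.matrix_power(Ad, k - 1 - j) @ Bd
    W = np.kron(np.eye(N + 1), Q * dt)           # stage cost weights
    W[-n:, -n:] = S                              # terminal cost weight
    H = G.T @ W @ G + R * dt * np.eye(N * m)     # condensed quadratic cost
    g = G.T @ W @ (F @ e0)
    return np.linalg.solve(H, -g).reshape(N, m)  # minimizer of the quadratic
```

With input or state constraints added, the same condensed problem becomes a QP and must be passed to a numerical solver instead of the closed-form solve.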
It is important to increase the flight duration and distance, both of which are directly
affected by the aerodynamic efficiency of the rotor [30, 31]. Ducted rotors are adopted
to increase the efficiency of the tail rotors of many helicopters. In the present study, the duct
contour was designed using flow simulation and tested using variable collective pitch
rotors. The experimental apparatus and the dimensions of the duct are presented in
Fig. 3.35. The thrust coefficient and the figure of merit, which is based on the electric
power consumption, are shown in Fig. 3.36. Comparing the results, the thrust and the
figure of merit of the ducted rotor are observed to be larger than those of the open
Fig. 3.35 Schematic aerodynamic test apparatus (top) and duct dimensions (bottom). Test stand is
located upstream [29]
Fig. 3.36 Thrust coefficient and figure of merit versus rotor collective pitch angle with ducted
rotor [29]
rotor. Prototypes of the duct were manufactured from thin CFRP parts for the flight
tests. It was confirmed that these advantages hold even when the weight of the
ducts is taken into account.
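The figure of merit based on electric power consumption compares the ideal induced power from momentum theory with the measured input power; a sketch under standard momentum-theory assumptions (the sample values are illustrative, not measured data from the experiment):

```python
import math

def figure_of_merit(thrust, electric_power, diameter, rho=1.225):
    """Figure of merit based on electric power: ratio of the ideal induced power
    T * sqrt(T / (2 * rho * A)) from momentum theory to the measured input power.

    thrust         : rotor thrust [N]
    electric_power : measured electric power consumption [W]
    diameter       : rotor diameter [m]
    rho            : air density [kg/m^3]
    """
    area = math.pi * (diameter / 2.0) ** 2
    p_ideal = thrust * math.sqrt(thrust / (2.0 * rho * area))
    return p_ideal / electric_power
```

A duct that produces the same thrust at lower electric power therefore shows up directly as a higher figure of merit, which is the comparison made in Fig. 3.36.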
distance to avoid unexpected motion of the drone. In case the drone is required to fly
within one rotor diameter of a wall, precise measurement of the distance to the wall
is recommended, and the flight control of the drone should include feedback of the
distance to the side wall for the rolling moment and of the distance to the upper wall
for the thrust control.
The dynamics of a drone having rotors is affected by the interaction between the
aerodynamics of the rotors and the surrounding structures [33]. The nominal
aerodynamic model varies, and the performance of the flight control deteriorates near
structures. An additional system to avoid collision is required for safe flight.
Therefore, an estimation method of the distance to the wall is proposed. The algorithm
is based on estimating the induced velocity of the rotor from the drone responses
sensed by the inertial sensors. Then, the correlation between the induced velocity
and the distance to the wall is used for the distance estimation. Firstly, the dynamics
of a quad-rotor drone is modeled [34]. The aerodynamic force generated by a
rotor was formulated combining blade element theory and momentum theory. The
induced velocity is a fundamental parameter determining the aerodynamic force. The
variation of the induced velocity results in the disturbance of aerodynamic force and
moment added to the drone motion. An experiment was conducted to investigate the
induced velocity variation near the wall (Fig. 3.43). The drone model was set near the
wall at an altitude of 0.9 m, and the heading angle to the wall was kept constant as in
Fig. 3.43. The
rotational speed of the rotor was maintained at the hovering condition of 4800 rpm.
The inlet and outlet wind velocities of the rotor were measured by a hot-wire
anemometer. The measured induced velocities, which are the averages of the measured
inlet and outlet velocities, are plotted against the distance to the wall (Fig. 3.44).
The theoretical induced velocities are also plotted. Although the measured value
is different from the theoretical value, the induced velocity decreases as the rotor
approaches the wall. Based on the experimental results, the induced velocity vari-
ations are modeled as a function of the distance to the wall. The average value of
the measured induced velocity variation from the induced velocity without a wall is
approximated as Eq. (3.13) for d/R_0 = 2.63–6.14. This model is used in the
estimation algorithm of the distance to the wall.
Δv_i = −0.0520 × (d/R_0)² + 0.926 × (d/R_0) − 3.78 (3.13)
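Given an estimated induced-velocity variation, Eq. (3.13) can be inverted for the wall distance with the quadratic formula; a sketch that keeps only the root inside the fitted range d/R_0 = 2.63–6.14:

```python
import numpy as np

def delta_vi(d_over_r0):
    """Induced-velocity variation model of Eq. (3.13), valid for d/R0 = 2.63-6.14."""
    return -0.0520 * d_over_r0**2 + 0.926 * d_over_r0 - 3.78

def estimate_wall_distance(dvi_est, r0, lo=2.63, hi=6.14):
    """Invert Eq. (3.13) for the distance to the wall d.

    Solves -0.0520 x^2 + 0.926 x - (3.78 + dvi_est) = 0 for x = d/R0 and keeps
    the real root inside the fitted range; returns None if no valid root exists.
    """
    roots = np.roots([-0.0520, 0.926, -(3.78 + dvi_est)])
    valid = [r.real for r in roots if abs(r.imag) < 1e-9 and lo <= r.real <= hi]
    return valid[0] * r0 if valid else None
```

Within the fitted range the model is monotonic, so the inversion is unique; outside it the observer output cannot be mapped to a distance and the function reports failure.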
A disturbance observer estimating moment variations was designed for the pitch
and roll dynamics of the drone. The observer gain was determined so that the H∞
norm of the transfer function from the input of the disturbance model to the estimation
error was minimized. The estimated moment variation is used to calculate the
induced velocity variations, and the distance to the wall can then be obtained from
the estimated induced velocity variation and Eq. (3.13). The estimation method of the
distance to the wall was validated by numerical simulations. The nonlinear dynamics
of the drone [34] containing an induced velocity variation model (Eq. (3.13)) was
used. In this simulation, it is assumed that the motions of altitude and yaw angle are
stabilized by the controller, and the drone dynamics is stabilized by a PID controller
with appropriate gains. The induced velocities were estimated from the angular
rate by the designed observer. The initial conditions are that the distance to the wall
is 0.7 m and the drone speed to the wall is 0.055 m/s. The estimated values of the
distance to the wall agree with the true values, even though the drone moves toward
the wall and the induced velocity varies. Our proposed system requires only
an inertial sensor, which is a major advantage for implementation on the flight
model. The proposed algorithm applies only to the specific flight condition for which
the induced velocity model was formulated. However, because the fundamental
performance was confirmed in the present research, a practical distance estimation
system can be constructed from an induced velocity model formulated for a wide
range of flight conditions.
3.3.9 Conclusions
To improve the robustness and reliability of multiple rotor drones, the variable pitch
angle control mechanism for rotors, the system that drives six rotors with two
motors, and guidance control methods were proposed and investigated through
several approaches. For the developed multiple rotor drones, a guidance control
method employing the model following and model predictive control methods in
combination was formulated. In addition, using ducted rotors for the multiple rotor drones
was also proposed. The variable pitch control method improved the responsivity of
thrust control and increased the aerodynamic efficiency when it was combined with
the optimized duct. To deal with more severe flight conditions, the change in
flight performance near walls was studied experimentally and numerically. These
techniques will be demonstrated in an actual flight test soon.
Acknowledgements The works of Sects. 3.3.7 and 3.3.8 were supported partially by
JSPS KAKENHI Grant Number JP 16H04385.
Abstract
Installing a robot arm and hand on a multicopter has a great potential and can be
used for several applications. However, the motion of the arm attached to the mul-
ticopter could disturb the balance of the multicopter. The objective of this research
is to address this issue using mechanical methods. An arm with four joints and three
actuated degrees-of-freedom is developed, which can be attached to an off-the-shelf
multicopter. The experimental results show that the mechanism is effective in reduc-
ing the influence of arm motions that disturb the balance of the multicopter. As
application examples, this study shows that the developed multicopter can grasp a
1 kg water bottle in midair, carry emergency supplies, install and retrieve sensors
such as cameras on a wire, and hang a rope ladder. Moreover, the multicopter can
land and take off in the manner of a bird perching on a tree branch.
3.4.1 Introduction
Installing a robot arm and hand on a multicopter has a great potential; however, the
motion of the arm may disturb the balance of the multicopter. Moreover, for the
wide use of robot arms and hands in multicopters, it is important to be able to use an
off-the-shelf multicopter. Studies have been conducted on mounting a robot arm and
hand on a multicopter [35–37]; however, these follow a different concept from the
one proposed in this study. Moreover, the robot arms and hands in previous studies
could not be used for all the applications described below.
The objective of this research is to realize a multipurpose robot arm and hand that
can be installed on off-the-shelf multicopters and used to perform tasks in
midair. Figure 3.45 shows the scope of this research. For people who want to evacuate
from a roof during a disaster, the multicopter can support evacuation by hanging a
rope ladder, as shown in Fig. 3.45a. Moreover, it can transport emergency supplies,
etc. without requiring landing, as shown in Fig. 3.45b. For example, radio equipment,
etc. can be delivered to victims, which will allow victim requests to be communicated
to the rescue team and the rescue team will be able to obtain useful information to
determine priority during the rescue operation. In addition, as shown in Fig. 3.45e,
cameras and sensors can be added and removed to monitor disaster situations.
Figure 3.45f, g shows that the multicopter can carry a remote robot to a danger
zone to inspect high-altitude locations. By using the hand as a foot, the multicopter
can land in the manner of a bird perching on a tree branch, as shown in Fig. 3.45c, d. As a
result, the landing range can be widened and applications to logistics in narrow areas
can be possible, as shown in Fig. 3.45c. Figure 3.45d shows that if the multicopter
is equipped with a camera, it can be used as a fixed point camera at high-altitude
locations, which can be used in security and measurement fields.
As application examples, this study shows that the developed multicopter can
grasp a 1 kg water bottle in midair, carry emergency supplies, add and remove sensors
such as cameras on wires, and hang a rope ladder. Moreover, the multicopter can
land and take off in the manner of a bird perching on a tree branch.
The remainder of this section is organized as follows. Section 3.4.2 describes the
mechanical concept of the proposed arm. Section 3.4.3 shows the developed robot
arm and hand while Sect. 3.4.4 evaluates the effectiveness of the developed arm.
Section 3.4.5 describes the applications of the multicopter, and Sect. 3.4.6 concludes.
3.4.2 Concept
When a robot arm and hand are installed on a multicopter to perform tasks in midair,
the motion of the arm disturbs the balance of the multicopter, as shown in Fig. 3.46.
This problem must be solved for the arm to be installed on an off-the-shelf multicopter.
Therefore, this research focuses on the configuration of the arm mechanism. There are
three major reasons why the arm motion disturbs the balance of the multicopter. The
first is the deviation of the center of gravity from the center, as shown in Fig. 3.46a.
The second is the counter torque of the arm, as shown in Fig. 3.46b, and the third is
the couple of forces, as shown in Fig. 3.46c. To address these problems, methods such
as adding weight, attaching a slider to adjust the position of the center of gravity, and
canceling the counter torque by the rotors are used. Existing configurations can
completely prevent the motion of the arm from affecting the motion of the
multicopter. However, such reaction restraint mechanisms need to be added, which
increases the weight of the arm.
To prevent the arm from becoming heavy, a mechanism is adopted that cancels
the adverse effects in (a) and (b). To cancel the motion in (a), a slider is mounted on
the developed arm. The slider can adjust the center of gravity of the arm so that it
stays at the center, as shown in Fig. 3.47. In order to adjust the center of gravity over
a wide range, it is preferable that the mass of the slider be large. In other words, the
heavier the slider, the wider the workspace of its end-effector. To reduce the adverse
effect in (b), the mass moment of inertia of transmission components is adjusted
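The slider compensation described above can be viewed as a moment balance: the slider offset is chosen so that the first moment of mass of the whole arm about the multicopter center vanishes. A planar, one-dimensional sketch with hypothetical masses (the actual arm is three-dimensional and its link data are not given here):

```python
def slider_position(link_masses, link_cogs, m_slider):
    """Slider offset that keeps the whole-arm center of gravity on the center line.

    link_masses : masses of the arm links [kg]
    link_cogs   : horizontal CoG offsets of the links from the center line [m]
    m_slider    : slider mass [kg]; a heavier slider needs less travel to cancel
                  a given offset, which is why a heavier slider widens the
                  reachable workspace of the end-effector.
    """
    moment = sum(m * x for m, x in zip(link_masses, link_cogs))
    return -moment / m_slider   # slider moves opposite to the arm's CoG shift
```

For example, two links of 1.0 kg and 0.5 kg with CoGs at 0.2 m and 0.4 m produce a 0.4 kg·m moment, which a 1.0 kg slider cancels at −0.4 m.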
Figure 3.48 shows the developed robot arm and hand installed on a multicopter. The
multicopter was manufactured by Dà-Jiāng Innovations Science and Technology Co.,
Ltd. (DJI). Its flight controller is the Wookong-M system, which was also manufac-
tured by DJI. The mass of this system without battery is 6.4 kg. The arm has four
joints and three actuated degrees-of-freedom. The mass of the arm without battery
is 2.7 kg. Its motors were manufactured by Maxon Motor (Maxon), each with an
output of 50 W. The motors are controlled by the motor driver ESCON Module 50/5
(Maxon). When these motors are operated at the maximum continuous current, the
arm can lift an object weighing 4.5 kg.
At the end of the arm, a robot hand was installed. Its mass is 0.25 kg. Its motor
is an EC20-flat (Maxon) with an output of 5 W. The motor is controlled by the same
motor driver as the arm. A load-sensitive continuously variable transmission [39]
was installed on this hand. The transmission can realize quick motion during open-
ing and closing operations, and can achieve a firm grasp when the fingers come into
contact with the object to be grasped. It can generate a large fingertip force of 50 N.
Therefore, it can crush an aluminum can, as shown in Fig. 3.49.
A camera was installed on the skid. When the multicopter flies, the skid opens
and lets the camera record a video of the arm and hand, as shown in Fig. 3.48. The
video is transmitted by a wireless communication system in the 2.4 GHz band. An
operator can operate the arm and hand by watching this video.
The arm and hand are controlled by three SH2-7047F microcomputers, which
were manufactured by the Renesas Electronics Corporation; operators send opera-
tional commands to these microcomputers using a wireless transmitter. These oper-
ational commands can also be sent via cable communications from a PC, which is
on the ground. In this case, the state variables of the robot can be stored in the PC. In
the experiment in Sect. 3.4.4, the cable communication of a Controller Area Network
(CAN) was used.
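The command protocol between the ground PC and the SH2-7047F microcomputers is not detailed in the text. As an illustration of sending operational commands over CAN, the sketch below packs a hypothetical joint command into the 8-byte data field of a classical CAN frame; the command ID, field layout, and helper names are assumptions, not the actual protocol.

```python
import struct

# Hypothetical encoding of an operational command into a CAN data field.
# A classical CAN frame carries at most 8 data bytes, so the command is
# packed as: u8 command ID, u8 joint index, f32 target angle, 2 pad bytes.

CMD_SET_JOINT_ANGLE = 0x01

def encode_joint_command(joint_id, angle_deg):
    """Pack (command, joint, angle) into exactly 8 little-endian bytes."""
    if not 0 <= joint_id < 4:          # the arm has four joints
        raise ValueError("joint_id out of range")
    return struct.pack("<BBf2x", CMD_SET_JOINT_ANGLE, joint_id, angle_deg)

def decode_joint_command(payload):
    cmd, joint_id, angle = struct.unpack("<BBf2x", payload)
    return cmd, joint_id, angle

frame = encode_joint_command(2, 45.0)
assert len(frame) == 8                  # fits a classical CAN data field
print(decode_joint_command(frame))      # (1, 2, 45.0)
```

On the ground-PC side, such a payload would be handed to a CAN interface library for transmission, and logging the decoded commands alongside the robot's state variables gives the stored experiment record mentioned above.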
3.4.4 Experiment
This section shows, through a more detailed analysis of the experiment in [38], that the multicopter does not lose its balance when the arm is activated. The maximum instantaneous wind velocity in this experiment was 3.8 m/s, and the hand held a 0.5 kg bottle of water. The multicopter hovered using the GPS control mode of the
flight controller. A local coordinate was defined for the multicopter, and the hand moved 200 mm in the x-direction at a velocity of 250 mm/s.
[Fig. 3.50 Position of the multicopter: position (mm) versus time (s), (a) with and (b) without the slider]
Figures 3.50a and 3.51a
show the experimental results and motion of the multicopter, respectively. It can be
seen that the measured value follows the target value appropriately. A marker was placed on the multicopter and captured by a camera on the ground.
From this image, the position of the multicopter on the global coordinate system can
be determined through image processing. In the experimental results of its trajectory,
the starting point of the experiment was defined as the origin. Even if the robot arm
moves, it can be seen that the multicopter hovers at the same position.
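The slider's role can be expressed as a one-dimensional moment balance: the slider mass is positioned so that the combined center of gravity (CG) of arm plus slider stays on the multicopter's center line. In the sketch below, the 1.0 kg slider mass and 100 mm arm CG offset are illustrative assumptions; only the 2.7 kg arm mass comes from the text.

```python
# Counterweight idea behind the slider: choose the slider position so that
# the combined CG of arm + slider stays at the multicopter's center (x = 0).

def slider_position(arm_mass, arm_cg_x, slider_mass):
    """Slider x-position cancelling the arm's CG offset (1-D moment balance):
    slider_mass * x_s + arm_mass * x_a = 0  ->  x_s = -arm_mass * x_a / slider_mass
    """
    return -arm_mass * arm_cg_x / slider_mass

def combined_cg(arm_mass, arm_cg_x, slider_mass, slider_x):
    return (arm_mass * arm_cg_x + slider_mass * slider_x) / (arm_mass + slider_mass)

# Example: the 2.7 kg arm reaches forward so its CG sits 100 mm ahead of the
# center line; a hypothetical 1.0 kg slider must then move 270 mm backward.
x_s = slider_position(2.7, 0.100, 1.0)
print(x_s)                                   # ~ -0.27 m
print(combined_cg(2.7, 0.100, 1.0, x_s))     # ~ 0.0 (balanced)
```

This also illustrates the trade-off noted earlier: a heavier slider needs less travel, so the achievable workspace of the arm grows with slider mass.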
For comparison, an experiment without using the slider was performed. In this
case, the center of gravity was not maintained at the center of the multicopter.
Figures 3.50b and 3.51b show the experimental results and motion of the multi-
copter, respectively. This measured value also follows the target value appropriately.
However, when the arm was activated, it was difficult to hover at the same position.
The results show that the multicopter moved 141 mm in the x-direction when using
the slider, and moved 313 mm in same direction without it. Therefore, the multicopter
without a slider moves 2.22 times more, which shows that the proposed mechanism
is effective in reducing the effects of arm motion.
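The reported factor can be checked directly from the two measured drifts:

```python
# Consistency check of the reported displacements: 141 mm drift with the
# slider versus 313 mm without it.
with_slider_mm, without_slider_mm = 141, 313
ratio = without_slider_mm / with_slider_mm
print(round(ratio, 2))   # 2.22, matching the factor reported in the text
```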
[Fig. 3.51 Time-lapse (1.0 s to 4.0 s) of the multicopter holding the 0.5 kg bottled water, (a) with and (b) without the slider]
In this section, applications using the developed robot arm and hand are described.
The multicopter was operated by two operators: one operated the multicopter while
the other operated the arm and the hand. These operations were realized using a
wireless communication system in the 2.4 GHz band. In Sect. 3.4.5.1, the multicopter
MS-06, manufactured by Autonomous Control Systems Laboratory Ltd., was used;
in the other sections, the S900 was used. In Sect. 3.4.5.2, the operator operated the
arm and hand by watching the image obtained from the camera installed on the
multicopter. In the other sections, the operator operated the arm and hand by watching
them directly.
Figure 3.52 shows the flying multicopter successfully lifting a 1 kg bottle of water. The
multicopter is stable when flying. However, when the hand comes into contact with
the ground with an off-the-shelf flight controller, the multicopter may become unsta-
ble. To overcome this phenomenon, quick manipulation is required; if the operator
observes instability, it is important to immediately move the multicopter upward.
Even in the successful example in Fig. 3.52, the multicopter has a large inclination,
as shown in Fig. 3.52(2)–(5). Because the hand grasps the bottle quickly and the
operator moves the multicopter upward immediately, this application is successful
despite the use of an off-the-shelf multicopter.
In this application, emergency supplies are contained in a cloth bag with a total mass of
0.7 kg. Figure 3.53 shows the multicopter successfully setting down the emergency
supplies without landing. In an emergency situation, most of the ground area is rough.
Therefore, it is important that the multicopter has the capability to place objects on
the ground without landing. In this experiment, even when the bag came into contact
with the ground, the motion of the hand was not restricted because cloth bags are
deformable. Therefore, the multicopter was able to maintain its balance.
Figure 3.54 shows the multicopter placing/lifting a camera on/from a cable. In this
application, because the cable can be deformed, the motion of the hand was not
restricted. Therefore, the multicopter was stable in this application. The camera was
installed at the pan-tilt stage, which can be controlled by wireless communication;
its total mass was 1 kg, including its battery.
Figure 3.55 shows the multicopter successfully hanging a rope ladder from midair.
The length of the ladder was 5 m, with a hook at the end weighing 0.12 kg; the
total mass was 0.7 kg. However, transporting the rope ladder in the manner shown
in Fig. 3.55(3) is not suitable for long distances because the hook shakes and affects
the flight of the multicopter. To resolve this issue, the rope ladder was fixed with
masking tape, as shown in Fig. 3.55(1); this tape is then cut by the motion shown in Fig. 3.55(2), releasing the ladder as shown in Fig. 3.55(3). In Fig. 3.55(4), the hook was hung, and in Fig. 3.55(5), the ladder was moved to the appropriate position by the arm. In Fig. 3.55(6), the hand released the rope ladder, and the multicopter with the proposed arm succeeded in hanging it, as shown in Fig. 3.55(7). In this application, the multicopter was mostly stable because the rope ladder is deformable and did not restrict the motion of the hand. However, if the tension of the rope ladder is high, it
will restrict the motion of the hand. Thus, there is a possibility that the multicopter
will become unstable. To avoid this, the operator must be careful about the tension of
the rope ladder. When the tension is high, the multicopter should go down to reduce
the tension. It should be noted that an operation opposite to that in Sect. 3.4.5.1 is
required.
[Fig. 3.55: photo sequence, panels (4)–(7); the masking tape is labeled in panel (1)]
3.4.5.5 Perching
Figure 3.56 shows the multicopter grasping a pipe and realizing a perching position.
During perching, the hand motion was restricted, and thus the multicopter became
unstable. Therefore, quick perching motion is required. As shown in Fig. 3.56(2)–
(4), the multicopter can adjust the slider position so that the center of gravity can be
maintained above the rod. Figure 3.56(5) shows that it is possible to stop the propeller.
3.4.6 Conclusions
This section proposed a mechanical concept of a robot arm and hand for multicopters.
The arm with four joints and three actuated degrees-of-freedom was developed,
which can be installed on an off-the-shelf multicopter. The experimental results
showed that the mechanism is effective in reducing the influence of the motion of
the arm, which disturbs the balance of the multicopter. The multicopter with the
developed arm and hand can be used for applications such as lifting and placing
objects, hanging a rope ladder, and those which require perching capability. In these
applications, two different off-the-shelf multicopters were used, both of which were
able to achieve stable operations; this shows that the proposed robot arm and hand is
versatile. Although the multicopter tends to become unstable when the hand comes into contact with the ground, this problem could be solved by incorporating a compliance mechanism.
Acknowledgements This research was funded by the ImPACT Program of the Council for Science, Technology and Innovation (Cabinet Office, Government of Japan). I also thank the many students of my laboratory for their help in developing and experimenting with this robot arm and hand.
3.5.1 Introduction
While many types of drones can achieve various missions [40], the noise from drones, and its consequences, deserves more attention. The drone is named partly for its insect-like noise [41], which may annoy people underneath when it is used in urban areas [42]. When a drone is equipped with microphones for surveillance, the noise from the propellers can reduce the accuracy of auditory sensing. It is, therefore, of great importance to reduce the noise from drones as much as possible.
Fig. 3.57 Microstructures on the primary feather of a Ural owl (Strix uralensis japonica)
The shape of the propeller can have a great impact on the acoustic properties
of drones, because the noise of the drone particularly at high frequency is mainly
generated by the interaction between the propeller and the airflow. In nature, owls
are widely known for their silent flight enabled by their unique wing morphologies
such as leading-edge serrations, trailing-edge fringes, and velvet-like surfaces [43]
as shown in Fig. 3.57. Numerical and experimental studies revealed that the leading-
edge serrations can passively control the laminar-turbulent transition on the upper
wing surface, and can be utilized for the reduction of aerodynamic noise [44, 45].
In this study, inspired by the unique wing morphologies of owls, various micro/macro structures were attached to drone propellers, and the acoustic and aerodynamic performances of the propellers were evaluated numerically and experimentally.
The propellers of the Phantom 3 (DJI Ltd.) and the PF1 (ACSL Ltd.) were employed as the basic propeller models in this study. The wingspans of the Phantom 3 and PF1 propellers are 243 and 385 mm, respectively. Various structures were attached to the propeller of the Phantom 3, and its acoustic and aerodynamic performances were evaluated. The
three-dimensional shapes of these propellers were reconstructed by using a laser
scanner (Laser ScanArm V2, FARO Technologies Inc.) as shown in Fig. 3.58. The
reconstructed surface of a propeller of Phantom 3 is used for the computational
analysis. All the reconstructed points were smoothed by a filter to remove measurement errors.
Fig. 3.58 Reconstructed shape of basic propellers for Phantom 3 and PF1
Fig. 3.59 Noise level measurement of a propellers for Phantom 3, b a hovering Phantom 3, and c
propellers for PF1
The noise level of the propellers and of a hovering drone was measured by a precision sound level meter (NL-52, RION Ltd.) for 10 s at a sampling rate of 48 kHz. The sound was analyzed with the AS-70 software (RION Ltd.) to calculate the overall sound level from the propellers. For the measurement of the noise from a propeller of the Phantom 3, the microphone was located at a distance of 1.0 m from the propeller along its rotational axis (Fig. 3.59a). The noise from a hovering Phantom 3 equipped with the propellers developed in this study was measured by a microphone at 1.5 m in the vertical direction and 4 m in the horizontal direction (Fig. 3.59b).
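As a sketch of what the sound level meter computes, the overall Z-weighted (i.e., unweighted) sound pressure level is the RMS pressure referenced to 20 µPa. The 1 kHz, 1 Pa synthetic tone below is an illustrative input, not measured data.

```python
import math

# Overall (Z-weighted, i.e., unweighted) sound pressure level from pressure
# samples:  SPL = 20 * log10(p_rms / p_ref),  with p_ref = 20 µPa.

P_REF = 20e-6  # reference pressure [Pa]

def overall_spl(samples):
    rms = math.sqrt(sum(p * p for p in samples) / len(samples))
    return 20.0 * math.log10(rms / P_REF)

# Synthetic example: a 1 kHz tone of 1 Pa amplitude, sampled at 48 kHz for
# 10 s. Its RMS is 1/sqrt(2) Pa, so the SPL is 20*log10((1/√2)/20e-6) ≈ 91 dB.
fs, dur, amp, freq = 48_000, 10.0, 1.0, 1000.0
tone = [amp * math.sin(2 * math.pi * freq * n / fs)
        for n in range(int(fs * dur))]
print(round(overall_spl(tone), 1))   # ~91.0 dB
```

A real meter additionally applies the selected frequency weighting before the RMS stage; with Z-weighting that filter is flat, which is why this bare RMS computation is a reasonable sketch.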
While it would be ideal to perform the experiments on the Phantom 3 and PF1 with the same setup, the propeller of the PF1 was placed vertically for the noise measurement (Fig. 3.59a), since this setup avoids unnecessary oscillation of the system due to the larger aerodynamic forces from the PF1 propeller. The noise from a propeller
In order to investigate the flow fields around the propeller and estimate the aerody-
namic performance of the propellers, the numerical simulation was performed for a
single propeller by using ANSYS CFX (ANSYS Inc.). Figure 3.60 shows the grid
system and boundary conditions. The meshes were automatically created by using the
ANSYS meshing application by setting the element size of the propeller, rotational
domain, and rotational periodicity planes to be 0.25, 15, and 6 mm, respectively, as
illustrated in Fig. 3.60a–c. Note that the inflation layers were generated around the
propeller surface to accurately capture the velocity gradients near no-slip walls, and
the meshes at the wingtip, leading edge, and trailing edge were clustered for resolving
the small vortices generated from the edges of the propeller as shown in Fig. 3.60c,
d. The total mesh number for the analysis of the basic propeller was approximately
10 million. The LES WALE model was adopted as the turbulence model for the
transition from laminar flow to developed turbulent flow. Considering the total weight of the Phantom 3, the rotational speed was set to 5,400 rpm, which is almost the same as when the Phantom 3 is hovering. The total number of time steps was set to 1,800, and the rotational angle varied from 0° to 180°. Note that the steady-state
analysis was performed before the time transient analysis, and the result was used as
the initial flow condition for the transient analysis. In this study, the coefficients of
the lift and drag forces C L and C D of a propeller are defined in the same way as in
[46].
$$C_L = \frac{L}{0.5\,\rho_a U^2 S_2}, \qquad (3.14)$$

$$C_D = \frac{Q}{0.5\,\rho_a U^2 S_3}, \qquad (3.15)$$
where L is the lift force, Q is the torque about the rotational axis, ρ_a is the density of air, U is the speed of the wingtip, and S_2 and S_3 are the second and third moments of the wing area, respectively. In order to evaluate the efficiency considering the differences in the lift forces due to the additional structures in the tested models, the figure of merit, FM, of a propeller is defined as
$$\mathrm{FM} = \frac{P_{RF}}{P_{CFD}}, \qquad (3.16)$$
Fig. 3.60 a Grid systems and boundary conditions. Blue and red domains are set to be (b) static
and (c) rotational domains, respectively. d Cross section of mesh at 80% of wing length
where P_CFD is given by the product of Q and the angular velocity, and P_RF is the minimum power for generating the resultant lift force obtained from the numerical analysis, derived using the Rankine-Froude momentum theory [47].
$$P_{RF} = L\,\sqrt{\frac{L}{2 \rho_a A_0}}, \qquad (3.17)$$
where A_0 is the area of the actuator disk defined by the propeller length, R.
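Equations (3.14)–(3.17) can be evaluated in a few lines. In the sketch below, all numerical inputs (lift, torque, wing-area moments) are illustrative placeholders rather than results from the chapter, and the actuator-disk area is assumed to be that of a disk of radius R.

```python
import math

# Sketch of evaluating Eqs. (3.14)-(3.17) from simulation outputs.
RHO_A = 1.225  # air density [kg/m^3]

def coefficients(L, Q, U, S2, S3):
    C_L = L / (0.5 * RHO_A * U**2 * S2)          # Eq. (3.14)
    C_D = Q / (0.5 * RHO_A * U**2 * S3)          # Eq. (3.15)
    return C_L, C_D

def figure_of_merit(L, Q, omega, R):
    A0 = math.pi * R**2                           # assumed actuator-disk area
    P_RF = L * math.sqrt(L / (2 * RHO_A * A0))    # Eq. (3.17), Rankine-Froude
    P_CFD = Q * omega                             # torque x angular velocity
    return P_RF / P_CFD                           # Eq. (3.16)

# Example at 5,400 rpm for a rotor radius of 0.1215 m (half the 243 mm
# Phantom 3 wingspan); lift and torque values here are made up.
omega = 5400 / 60 * 2 * math.pi                   # rad/s
FM = figure_of_merit(L=3.5, Q=0.06, omega=omega, R=0.1215)
print(round(FM, 2))                               # ~0.57 for these inputs
```

A value of FM well below 1 is expected: P_RF is the momentum-theory minimum, so FM measures how close the simulated rotor comes to that ideal.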
Examples of the tested propeller models are shown in Fig. 3.61. Compared to the basic propeller, the serration, winglet, and velvet-surface models generated a similar level of noise when attached to the hovering drone. On the other hand, the serration and velvet-surface models were found to reduce the higher-frequency noise around 2,000 Hz when the propellers rotate at the same speed as the basic propeller. In this study, the results of the models with the attachment at the trailing edge (Fig. 3.61a, b), which were designed after testing several models inspired by the trailing-edge fringes, are discussed further. These models exhibited the most preferable performance for the development of the low-noise propeller.
Figure 3.61d shows the propeller that has an additional structure of an aluminum
plate of dimension 10 × 20 mm with a thickness of 1 mm at the trailing edge of
a propeller for Phantom 3. Note that the size of the plate was designed such that
it maintains the flight efficiency of the basic model. This plate was attached to the
propeller along the trailing edge on the lower surface with an overlapped region of
dimension 2 × 20 mm for bonding the propeller and the plate. The attached area
was smoothened by a piece of tape. The experiment for measuring the overall sound
pressure level and numerical analyses for evaluating FMs revealed that the noise
Fig. 3.61 a The flat and b curved plates are attached to the trailing edge of propellers for Phantom
3 and PF1, respectively
Fig. 3.62 Overall sound pressure level and figure of merit of the propellers with the attachment at
trailing edge for a Phantom 3 and b PF1. Each panel summarizes the effects of a spanwise position
and b curvature. Blue and red circles represent the basic models and the models for further analyses,
respectively
Fig. 3.63 Pressure distributions on upper surfaces of a basic and plate attached propellers at b
0.7R, c 0.8R, d 0.9R. The iso-surfaces of Q-criterion at 0.004 are shown by grey
the other models seem to suppress the generation of the small shedding vortex by
the attached plate.
Based on the parametric study of the position of the plate, the 0.8R model was adopted as the low-noise propeller and evaluated in detail. The frequency spectra of the measured noise of the hovering drone with the basic and low-noise propellers are illustrated in Fig. 3.64a. It is observed that the sound pressure level of the low-noise propeller model was uniformly suppressed within the frequency range between 200 and 20,000 Hz. The Z-weighted sound pressure levels of the basic and low-noise propellers are 72.5 and 70.1 dB, respectively. It is confirmed that the additional structure at the trailing edge can suppress the noise by about 2.4 dB with the same amount of lift generation.
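The practical meaning of the 2.4 dB figure can be made concrete by converting the level difference into a ratio of radiated acoustic power:

```python
# Expressing the measured reduction (72.5 dB -> 70.1 dB) as a ratio of
# radiated acoustic power: a drop of dL dB corresponds to 10**(-dL/10).
basic_db, low_noise_db = 72.5, 70.1
delta_db = basic_db - low_noise_db
power_ratio = 10 ** (-delta_db / 10)
print(round(delta_db, 1), round(power_ratio, 2))   # 2.4 dB -> ~0.58
```

That is, the low-noise propeller radiates roughly 58% of the basic propeller's acoustic power, a reduction of about 42%, while the lift is unchanged.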
Based on the results for the Phantom 3 propeller, a plate was attached at the trailing edge of the propeller for the PF1. Aiming at further noise reduction, the area of the attached plate was enlarged in comparison with the plate attached to the Phantom 3 propeller.
A systematic experiment was performed to examine the effect of the positions and shapes of the attachment; it revealed that a plate attachment close to the wingtip is also effective, in spite of the differences in airfoil, planform, and size between the Phantom 3 and PF1.
After such trial and error, it was found that there is a trade-off between the acoustic and aerodynamic performances. As a typical example, the effect of the curvature radius of the plate, ρ, on the overall noise level and the aerodynamic efficiency is summarized in Fig. 3.62b. In comparison with the basic model, a plate with a larger curvature radius reduces both the noise level and the FM. With decreasing curvature radius, the noise level and FM decrease further. These results imply that the noise can be reduced by attaching the plate at the trailing edge, but the shape of the propeller becomes aerodynamically suboptimal. From the frequency spectrum (Fig. 3.64b), it is confirmed that the noise in the frequency range between 2,000 and 20,000 Hz is mainly decreased by attaching the plate (ρ = 30).
The noise reduction by the plate attachment is attributable to the reduction in rotational speed associated with the increase of the wing area. Table 1 lists the morphological parameters and aerodynamic performances of the basic and low-noise propellers for the Phantom 3 and PF1. Note that the vertical force and counter torque of the propellers for the Phantom 3 are estimated from the numerical analyses, while those of the propellers for the PF1 are estimated from the force measurement. The lift and torque of the low-noise propeller for the Phantom 3 are equally increased, which is due to the increase in the second and third moments of the wing area, as can be seen from the lift and torque coefficients. It is also confirmed that the FM is maintained at a similar
level. By attaching the plate, the rotational speed is decreased in order to maintain the lift force, because of the increase of the second moment of the wing area. The noise induced by a rotating wing is known to be strongly affected by the rotational speed of the wing. Therefore, the noise measured for the Phantom 3 with the low-noise propellers, and for the single low-noise propeller for the PF1, is likely decreased mainly by the reduction in rotational speed that accompanies the increase in wing area of the low-noise propeller. From the flow visualizations (Fig. 3.63a, c), another possible reason for the reduction in the total noise level by attaching the plate is the decrease in the small shedding vortices from the trailing edge around the wingtip, which are thought to increase the noise level within a wide range of frequencies.
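This rotational-speed mechanism can be quantified from Eq. (3.14): at a fixed lift coefficient, the lift scales as ω²S₂ (the tip speed U is proportional to ω), so a larger second moment of wing area allows a lower speed for the same lift. The 5% increase in S₂ used below is an illustrative assumption, not a measured value.

```python
import math

# If L is proportional to omega^2 * S2 at fixed C_L (from Eq. 3.14 with
# U ∝ omega), keeping the lift constant after S2 grows requires
#   omega_new = omega_old * sqrt(S2_old / S2_new).

def rpm_for_constant_lift(rpm_old, s2_old, s2_new):
    return rpm_old * math.sqrt(s2_old / s2_new)

# Hypothetical example: a plate that raises S2 by 5% lets the hovering speed
# drop from 5,400 rpm to roughly 5,270 rpm.
rpm_new = rpm_for_constant_lift(5400, s2_old=1.00, s2_new=1.05)
print(round(rpm_new))   # ~5270 rpm: slower rotation, hence less noise
```

Because rotor noise rises steeply with rotational speed, even such a modest speed reduction can account for a measurable drop in the overall level.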
3.5.4 Conclusions
In this study, low-noise propellers for two types of drones were developed, inspired by the unique wing morphologies of owls. Through various tests with an attached plate inspired by the trailing-edge fringes of owls, it was found that the additional plate
attached to the trailing edge could effectively suppress the noise level. The spanwise position of the plate was determined by the relative position of the wingtip vortex. By introducing a propeller with the plate attached at 0.8R from the wing base, the noise induced by a hovering drone, the Phantom 3, was successfully suppressed by more than 2 dB, whereas the power consumption was maintained at a similar level. Such a bio-inspired attachment was further confirmed to be effective in reducing noise for the larger propeller of the PF1, although the power consumption was observed to increase slightly due to the trade-off between acoustic and aerodynamic performances. A key mechanism of the noise reduction in the bio-inspired propeller likely lies in the combination of a reduction in rotational speed with increased wing area and control of the shedding of trailing-edge and wingtip vortices. While further optimization of the biomimetic wing design is necessary, our results point to the effectiveness and feasibility of the bio-inspired approach [48–51] for improving the aero-acoustic performance toward the development of novel low-noise drones.
Acknowledgements This work was supported by the Impulsing Paradigm Change through Disruptive Technologies (ImPACT) Tough Robotics Challenge program of the Japan Science and Technology Agency (JST).
References
1. https://www.statista.com/statistics/510894/natural-disasters-globally-and-economic-losses/
2. Murphy, R.R.: Disaster Robotics. MIT Press, Cambridge (2014)
3. Valavanis, K.P., Vachtsevanos, G.J. (eds.): Handbook of Unmanned Aerial Vehicles, pp. 529–
710, 2997–3000. Springer, Berlin (2015)
4. Office of the Secretary of Defense, Unmanned Aircraft Systems Roadmap, 2005–2030
5. Nonami, K., Kartidjo, M., Yoon, K.J., Budiyono, A. (eds.): Autonomous Control Systems and
Vehicles: Intelligent Unmanned Systems, pp. 55–71. Springer, Berlin (2013)
6. Quan, Q.: Introduction to Multicopter Design and Control. Springer, Berlin (2017)
7. Nonami, K.: All of the Drone Industry Application, pp. 179–183. Ohmsha (2018)
8. http://www.kantei.go.jp/jp/singi/kogatamujinki/pdf/shiryou6.pdf
9. Nonami, K., Kendoul, F., Wan, W., Suzuki, S., Nakazawa, D.: Autonomous Flying Robots.
Springer, Berlin (2010)
10. Nonami, K., Inoue, S.: Research on sub-committee of aerial robot of ImPACT/TRC. J. Robot.
Soc. Jpn. 35(10), 700–706 (2017)
11. de Bree, H.-E.: Acoustic vector sensors increasing UAV’s situational awareness, SAE Technical
Paper, pp. 2009–01–3249 (2009). https://doi.org/10.4271/2009-01-3249
12. Kaushik, B., Nance, D., Ahuja, K.K.: A review of the role of acoustic sensors in the mod-
ern battlefield. In: Proceedings of 11th AIAA/CEAS Aeroacoustics Conference (26th AIAA
Aeroacoustics Conference), pp. 1–13 (2005)
13. Basiri, M., Schill, F., Lima, P.U., Floreano, D.: Robust acoustic source localization of emergency
signals from micro air vehicles. In: Proceedings of the IEEE/RSJ International Conference on
Robots and Intelligent Systems (IROS), pp. 4737–4742 (2012)
14. Okutani, K., Yoshida, T., Nakamura, K., Nakadai, K.: Outdoor auditory scene analysis using
a moving microphone array embedded in a quadrocopter. In: Proceedings of the IEEE/RSJ
International Conference on Robots and Intelligent Systems (IROS), pp. 3288–3293 (2012)
15. Schmidt, R.O.: Multiple emitter location and signal parameter estimation. IEEE Trans. Anten-
nas Propag. 34(3), 276–280 (1986). https://doi.org/10.1109/TAP.1986.1143830
16. Ohata, T., Nakamura, K., Mizumoto, T., Tezuka, T., Nakadai, K.: Improvement in outdoor
sound source detection using a quadrotor-embedded microphone array. In: Proceedings of the
IEEE/RSJ International Conference on Robots and Intelligent Systems (IROS), pp. 1902–1907
(2014)
17. Furukawa, K., Okutani, K., Nagira, K., Otsuka, T., Itoyama, K., Nakadai, K., Okuno, H.G.:
Noise correlation matrix estimation for improving sound source localization by multirotor UAV.
In: Proceedings of the IEEE/RSJ International Conference on Robots and Intelligent Systems
(IROS), pp. 3943–3948 (2013)
18. Hoshiba, K., Washizaki, K., Wakabayashi, M., Ishiki, T., Kumon, M., Bando, Y., Gabriel, D.,
Nakadai, K., Okuno, H.G.: Design of UAV-embedded microphone array system for sound
source localization in outdoor environments. Sensors 17(11), 1–16 (2017). https://doi.org/10.
3390/s17112535
19. Nakadai, K., Takahashi, T., Okuno, H.G., Nakajima, H., Hasegawa, Y., Tsujino, H.: Design
and Implementation of robot audition system ‘HARK’–open source software for listening to
three simultaneous speakers. Adv. Robot. 24(5–6), 739–761 (2010). https://doi.org/10.1163/
016918610X493561
20. Wakabayashi, M., Kumon, M.: Position estimation of multiple sound sources on the ground
by multirotor helicopter with microphone array. JSAI Technical report SIG-Challenge-049-3,
pp. 15–22 (2017). (in Japanese)
21. Konstantinova, P., Udvarev, A., Semerdjiev, T.: A study of a target tracking algorithm using
global nearest neighbor approach. In: Proceedings of International Conference on Computer
Systems and Technologies (CompSysTech), pp. 290–295 (2003)
22. Washizaki, K., Wakabayashi, M., Kumon, M.: Position estimation of sound source on ground
by multirotor helicopter with microphone array. In: Proceedings of the IEEE/RSJ International
Conference on Robots and Intelligent Systems (IROS), pp. 1980–1985 (2016)
23. Hoshiba, K., Sugiyama, O., Nagamine, A., Kojima, R., Kumon, M., Nakadai, K.: Design and
assessment of sound source localization system with a UAV-embedded microphone array. J.
Robot. Mechatron. 29(1), 154–167 (2017). https://doi.org/10.20965/jrm.2017.p0154
24. Morito, T., Sugiyama, O., Kojima, R., Nakadai, K.: Partially shared deep neural network in
sound source separation and identification using a UAV-embedded microphone array. In: Pro-
ceedings of the IEEE/RSJ International Conference on Robots and Intelligent Systems (IROS),
pp. 1299–1304 (2016)
25. Morito, T., Sugiyama, O., Kojima, R., Nakadai, K.: Reduction of computational cost using
two-stage deep neural network for training for denoising and sound source identification. In:
Proceedings of the IEA/AIE 2016 Trends in Applied Knowledge-Based Systems and Data
Science. Series Lecture Notes in Computer Science, vol. 9799, pp. 562–573 (2016)
26. Sugiyama, O., Uemura, S., Nagamine, A., Kojima, R., Nakamura, K., Nakadai, K.: Outdoor
acoustic event identification with DNN using a quadrotor-embedded microphone array. J.
Robot. Mechatron. 29(1), 188–197 (2017). https://doi.org/10.20965/jrm.2017.p0188
27. Sunada, S., Tanabe, Y., Yonezawa, K., Tokutake, H.: Improvement of flight performance of
minisurveyor by using pitch control and decreasing the number of rotors. In: 2016 Asia-Pacific
International Symposium on Aerospace Technology, Toyama, Japan (2017)
28. Yonezawa, K., Yoshida, N., Sugiyama, K., Tokutake, H., Tanabe, Y., Sunada, S.: Devel-
opment of a multicopter with ducted and variable pitch rotors. In: Proceedings of the 5th
Asian/Australian Rotorcraft Forum, American Helicopter Society International, Fairfax, Vir-
ginia. U.S. (2016)
29. Yonezawa, K., Matsumoto, H., Sugiyama, K., Tokutake, H., Tanabe, Y., Sunada, S.: Development of a ducted rotor for multicopters. In: 6th Asian/Australian Rotorcraft Forum (ARF) and
Heli Japan 2017, Kanazawa, Japan (2017)
30. Hrishikeshavan, V., Sirohi, J., Tishchenko, M., Chopra, I.: Design, development, and testing
of a shrouded single-rotor micro air vehicle with antitorque vanes. J. Am. Helicopter Soc. 56,
012008-1–012008-11 (2011)
31. Hrishikeshavan, V., Black, J., Chopra, I.: Design and performance of a quad-shrouded rotor
micro air vehicle. J. Aircr. 51, 779–791 (2014)
32. Tanabe, Y., Saito, S., Sugawara, H.: Construction and validation of an analysis tool chain for
rotorcraft active noise reduction. In: Proceedings of 38th European Rotorcraft Forum (ERF),
Netherlands Association of Aeronautical Engineers, Amsterdam, NL (2012)
33. Tanabe, Y., Sugiura, M., Aoyama, T., Sugawara, H., Sunada, S., Yonezawa, K., Tokutake, H.:
Multiple rotors hovering near an upper or a side wall. J. Robot. Mechatron. 30, 344–353 (2018)
34. Torita, R., Kishi, T., Tokutake, H., Sunada, S., Tanabe, Y., Yonezawa, K.: Modeling of aero-
dynamic characteristics of drone and improvement of gust response. In: 6th Asian/Australian
Rotorcraft Forum (ARF) and Heli Japan 2017, Kanazawa, Japan (2017)
35. Kim, S., Choi, S., Kim, H.J.: Aerial manipulation using a quadrotor with a two DOF robotic
arm. In: Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and
Systems, pp. 4990–4995 (2013)
36. Kim, S., Seo, H., Choi, S., Kim, H.J.: Vision-guided aerial manipulation using a multirotor
with a robotic arm. IEEE/ASME Trans. Mechatron. 21(4), 1912–1923 (2016)
37. Lippiello, V., Cacace, J., Santamaria-Navarro, A., Andrade-Cetto, J., Trujillo, M.À., Esteves,
Y.R., Viguria, A.: Hybrid visual servoing with hierarchical task composition for aerial manip-
ulation. IEEE Robot. Autom. Lett. 1(1), 259–266 (2016)
38. Ohnishi, Y., Takaki, T., Aoyama, T., Ishii, I.: Development of a 4-Joint 3-DOF robotic arm with
anti-reaction force mechanism for a multicopter. In: Proceedings of the 2017 IEEE/RSJ
International Conference on Intelligent Robots and Systems, pp. 985–991 (2017)
39. Parsons, B.N.V., Stanton, S.: Linear actuator. US Patent 4,794,810 (1989)
40. Nonami, K.: Drone technology, cutting-edge drone business, and future prospects. J. Robot.
Mech. 28, 262–272 (2016)
41. Rothstein, A.: Drone (Object Lessons). Bloomsbury Publishing, New York (2015)
42. Christian, A., Cabell, R.: Initial investigation into the psychoacoustic properties of small
unmanned aerial system noise. In: 23rd AIAA/CEAS Aeroacoustics Conference, AIAA AVI-
ATION Forum, AIAA 2017-4051 (2017)
43. Wagner, H., Weger, M., Klaas, M., Schröder, W.: Features of owl wings that promote silent flight.
Interface Focus 7, 20160078 (2017)
44. Ikeda, T., Ueda, T., Nakata, T., Noda, R., Tanaka, H., Fujii, T., Liu, H.: Morphology effects
of leading-edge serrations on aerodynamic force production: an integrated study with PIV and
force measurements. J. Bionic Eng. 14, 661–672 (2018)
45. Rao, C., Ikeda, T., Nakata, T., Liu, H.: Owl-inspired leading-edge serrations play a crucial role
in aerodynamic force production and sound suppression. Bioinspiration Biomim. 12, 046008
(2017)
46. Usherwood, J.R., Ellington, C.P.: The aerodynamics of revolving wings I. Model hawkmoth
wings. J. Exp. Biol. 205, 1547–1564 (2002)
47. Ellington, C.P.: The aerodynamics of hovering insect flight. V. A vortex theory. Philos. Trans.
R. Soc. Lond. B 305, 115–144 (1984)
48. Henderson, L., Glaser, T., Kuester, L.: Towards bio-inspired structural design of a 3D printable,
ballistically deployable, multi-rotor UAV. In: 2017 IEEE Aerospace Conference (2017)
49. Keshavan, J., Gremillion, G., Escobar-Alravez, H., Humbert, J.S.: A μ analysis-based
controller-synthesis framework for robust bioinspired visual navigation in less-structured envi-
ronments. Bioinspiration Biomim. 9, 025011 (2014)
50. Mintchev, S., de Rivaz, S., Floreano, D.: Insect-inspired mechanical resilience for multicopters.
IEEE Robot. Autom. Lett. 2, 1248–1255 (2017)
51. Sabo, C., Cope, A., Gurny, K., Vasilaki, E., Marshall, J.A.: Bio-inspired visual navigation for
a quadcopter using optic flow. In: AIAA Infotech @ Aerospace, AIAA SciTech Forum, AIAA
2016-0404 (2016)
Chapter 4
Cyber-Enhanced Rescue Canine
K. Ohno (B)
NICHe, Tohoku University/RIKEN AIP, Aramaki Aza Aoba 6-6-01,
Aoba-ku, Sendai-shi, Miyagi, Japan
e-mail: kazunori@rm.is.tohoku.ac.jp
R. Hamada
NICHe, Tohoku University, Aramaki Aza Aoba 6-6-01, Aoba-ku,
Sendai-shi, Miyagi, Japan
e-mail: hamada@rm.is.tohoku.ac.jp
T. Hoshi · H. Nishinoma · S. Yamaguchi · S. Tadokoro
GSIS, Tohoku University, Aramaki Aza Aoba 6-6-01, Aoba-ku,
Sendai-shi, Miyagi, Japan
e-mail: tadokoro@rm.is.tohoku.ac.jp
S. Arnold · K. Yamazaki
Shinshu University, Wakasato 4-17-1, Nagano, Japan
e-mail: s_arnold@shinshu-u.ac.jp
K. Yamazaki
e-mail: kyamazaki@shinshu-u.ac.jp
T. Kikusui · S. Matsubara · M. Nagasawa
Azabu University, Sagamihara, Kanagawa, Japan
e-mail: takkiku@carazabu.com
M. Nagasawa
e-mail: nagasawa@carazabu.com
Web Services (AWS), Google Maps, and camera server). The trajectory can be
plotted on an aerial photograph captured by flying robots or disaster response robots.
The visualization results can be confirmed in real time via the cloud servers on the
tablet terminal located in the command headquarters and with the handler. We developed various types of CRC suits that can measure the activities of large- and medium-size SAR dogs through non-invasive sensors on the CRC suits, and we visualized
the activities from the sensor data. In addition, a practical CRC suit was developed
with a company and evaluated using actual SAR dogs certified by the Japan Rescue
Dog Association (JRDA). Through the ImPACT Tough Robotics Challenge, tough sensing technologies for CRC suits were developed to visualize the activities of SAR dogs. The primary contributions of our research include the following six topics. (1)
Lightweight CRC suits were developed and evaluated. (2) Objects left by victims
were automatically found using images from a camera mounted on the CRC suits.
A deep neural network was used to find suitable features for searching for objects
left by victims. (3) The emotions (positive as well as negative) of SAR dogs were
estimated from their heart rate variation, which was measured by CRC inner suits.
(4) The behaviors of SAR dogs were estimated from an IMU sensor mounted on
the CRC suit. (5) The visual SLAM and inertial navigation systems for SAR dogs
were developed to estimate trajectory in non-GNSS environments. These emotions,
movements, and trajectories are used to visualize the search activities of the SAR
dogs. (6) SAR dogs were trained to search an area by being guided with the laser light sources mounted on the CRC suit.
T. Kubo · K. Ikeda
GSST, Nara Institute of Science and Technology, 8916-5 Takayama-cho,
Ikoma-shi, Nara, Japan
E. Nakahara
GSIS, Nara Institute of Science and Technology, 8916-5 Takayama-cho,
Ikoma-shi, Nara, Japan
Y. Maruno
Faculty for the Study of Contemporary Society, Kyoto Women’s University,
35 Imagumano Kitahiyoshi-cho, Higashiyama-ku, Kyoto, Japan
T. Yamakawa
Kumamoto University, 2-39-1 Kurokami, Chuo-ku,
Kumamoto-shi, Kumamoto, Japan
e-mail: yamakawa@cs.kumamoto-u.ac.jp
T. Tokuyama
GSIS, Tohoku University, Aramaki Aza Aoba 6-3-09, Aoba-ku,
Sendai-shi, Miyagi, Japan
e-mail: tokuyama@dais.tohoku.ac.jp
A. Shinohara · R. Yoshinaka · D. Hendrian · K. Chubachi · S. Kobayashi
K. Nakashima · H. Naganuma · R. Wakimoto · S. Ishikawa · T. Miura
GSIS, Tohoku University, Aramaki Aza Aoba 6-6-05, Aoba-ku,
Sendai-shi, Miyagi, Japan
e-mail: ayumi@ecei.tohoku.ac.jp
4 Cyber-Enhanced Rescue Canine 145
4.1.1 Introduction
Cyber-enhanced rescue canines (CRCs) are search and rescue (SAR) dogs that wear CRC suits. These suits digitally strengthen the capabilities of a SAR dog (Fig. 4.1). SAR dogs wearing CRC suits realize a new approach to victim exploration by combining the sensing techniques of disaster response robots with the inherent abilities of SAR dogs and conveying the results to rescue workers. Figure 4.2 shows CRC suit No. 4 worn by a SAR dog certified by the
Japan Rescue Dog Association (JRDA). We have been developing these CRC suits
since 2011.
Robin Murphy suggests in “Disaster Robotics” that SAR dogs can play a complementary role with rescue robots [33]. Information gathering in disaster areas is
the first step to finding victims and to keeping the victims and rescue workers from
danger. Disaster response robot researchers have been developing exploration robots
that contribute to quick and safe information gathering. Various types of robots such
as tracked vehicles, flying robots, and underwater vehicles have been developed for
exploration [27, 32, 34], and it has become possible to share information about disaster scenarios as digital data in real time. However, in actual disaster areas, SAR dogs have superior search ability, and thus play an important role in rescue activities.
Fig. 4.1 Concept of cyber-enhanced rescue canine: search and rescue (SAR) dogs are digitally
strengthened using robotics technologies
146 K. Ohno et al.
Fig. 4.2 Cyber-enhanced rescue canine: a SAR dog certified by JRDA (name: Gonta) wore CRC suit No. 4 and searched for a victim beneath the rubble at the ImPACT Tough Robotics Challenge (TRC) test field
The use of SAR dogs is becoming more common in rescue missions at disaster sites.
SAR dogs are trained to search and find the odors released from victims and show
a final response behavior, such as barking or sitting. The target odors become discriminative stimuli by associating them with pleasant consequences. This conditioning is
accomplished by pairing target odors (stimuli) and discoveries (response) with a high
value reward (reinforcement) such as toys, food, or play. Final response behaviors,
such as barking or sitting are usually trained separately and then paired with the target
odor [5]. In a real situation, SAR dogs can quickly locate the positions of victims
based on their keen sense of the target odors. SAR dogs have been used in actual
disasters, such as the Jiji Earthquake in Taiwan in 1999 [7], and Great East Japan
Earthquake in 2011.
At the disaster site, SAR dogs perform search activities for several hours, which
include waiting and moving. They conduct multiple 10-minute-long explorations at
specific intervals. In 10 minutes of exploration, SAR dogs can find multiple victims
hidden across multiple floors in a building.
However, SAR dogs face challenges when attempting to rescue victims in collaboration with rescue workers (e.g., firefighters in Japan). Rescue workers decide the priority of rescue (i.e., triage) based on information gathered from the disaster
site. This triage requires detailed information, such as the number of victims, state of
injury, and surrounding circumstances, in addition to the location of the victims. SAR
dogs see this information during the exploration process, but they cannot explain it in words. If SAR dogs could report detailed information about the victims, the locations they have explored, and the characteristics of the place where they found a victim, the rescue workers would be able to rescue victims more efficiently
and securely. By overlaying the search area with aerial photographs taken by a flying
robot, it becomes possible to understand the situation of the disaster site accurately.
Therefore, it is necessary to develop a method for sharing information about the area explored and observed by the SAR dog with the handlers on site and with the rescue workers at headquarters in a remote location.
These problems can be solved by attaching sensors to a SAR dog, measuring the vision and behavior of the dog, and then visualizing them. For example, the motion
and trajectory of the SAR dog can be estimated using a global navigation satellite
system (GNSS) or using inertial measurement unit (IMU) sensors attached to the
SAR dogs. These data can be used to visualize the activities of SAR dogs during exploration.
Therefore, we have developed tough sensing technologies for the ImPACT Tough Robotics Challenge (TRC).
Cyber-enhanced rescue canines overcome the drawbacks of SAR dogs using robotics
technology, especially recording and visualizing technology. Figure 4.3 shows the
system overview of a cyber-enhanced rescue canine. A SAR dog wears a CRC suit
equipped with sensors (Camera, IMUs, GNSS). The activities of the SAR dog and
its surrounding vision and sound are measured by the sensors mounted on the CRC
suit. The sensor data are used to visualize the scene witnessed by the SAR dog, its
trajectory, its behavior (walking, running, barking, etc.), and its internal state via
cloud services (Amazon Web Services (AWS), Google Maps, camera server). The
trajectory can be plotted on aerial photographs taken by flying robots or disaster
response robots. The visualization results can be confirmed in real time via the
cloud server on the tablet terminal, which is held by the handler or located at the
headquarters.
positioning data offline. However, the target environment of the SAR dogs is
quite wide, and they can search for victims indoors and outdoors, under debris,
in forests, and in mountains. In forests or nearby buildings, GNSS cannot provide
good positioning data. To overcome this problem, we developed two different
positioning methods. One method is visual SLAM, which estimates the trajectory
in non-GNSS environments. The other is an inertial navigation system based on
the gait of the dog. Details regarding these positioning methods are described in
Sect. 4.6.1.
VI. Remote instruction of SAR dog’s behaviors
The handler gives search instructions to the dog vocally or through hand gestures.
However, the dogs may also perform searches in places where the handlers cannot
be seen or where their voices cannot be heard. Even under such circumstances,
we need to be able to give instructions to the SAR dogs. Therefore, we developed
a new method for providing instructions. Laser light sources were used to instruct
the SAR dog at remote places. Details of the remote instruction are described in
Sect. 4.6.2.
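The gait-based positioning method mentioned above can be sketched roughly as follows. This is only an illustration under our own assumptions (stride length, footfall-detection threshold, and heading integration are hypothetical simplifications); the actual system described in Sect. 4.6.1 is more sophisticated. Each detected footfall advances the position estimate by one stride along a heading integrated from the yaw rate.

```python
import numpy as np

def dead_reckon(accel_mag, gyro_yaw, dt=0.01, stride=0.6, thresh=12.0):
    """Minimal gait-based dead-reckoning sketch.

    accel_mag: acceleration magnitude samples (m/s^2)
    gyro_yaw:  yaw rate samples (rad/s)
    Each footfall (a local peak of accel_mag above `thresh`) advances
    the position estimate by one stride along the integrated heading.
    All parameter values are hypothetical.
    """
    heading = 0.0
    x, y = 0.0, 0.0
    path = [(x, y)]
    for i in range(1, len(accel_mag) - 1):
        heading += gyro_yaw[i] * dt  # integrate yaw rate into heading
        is_peak = (accel_mag[i] > thresh
                   and accel_mag[i] >= accel_mag[i - 1]
                   and accel_mag[i] > accel_mag[i + 1])
        if is_peak:  # one footfall -> one stride forward
            x += stride * np.cos(heading)
            y += stride * np.sin(heading)
            path.append((x, y))
    return np.array(path)
```

Such dead reckoning drifts over time, which is why the chapter pairs it with visual SLAM rather than relying on it alone.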
Because the concept of combining animals and sensors is new in robotics, few studies
have recorded animal activities in a real environment. A. Ferworn developed support
technology for police canines. Using network technology and sensors (a camera,
GPS), police can observe the scene witnessed by the dog, the sound, and its location
[13]. Their next target is search and rescue (SAR) dogs, and SAR dog activities
were monitored using network technologies and sensors [12]. Then, the Canine
Augmentation Technology (CAT) project recorded urban search and rescue (USAR)
dog activities [14, 40]. In the CAT project, the activities of USAR dogs were monitored using cameras, microphones, and a GPS device. The results suggest
that data from the sensors mounted on USAR dogs help first responders to understand the situation at a disaster site. There was also an interesting approach to monitoring cat activities [46]. Cat activities were recorded using a camera, and the
recorded activities were tweeted via a mobile network. People shared the tweeted
data on the Internet. Tactical Electronic Co. developed several types of dog cameras
(for the chest and back) [23]. These cameras have been used to record dog activities
at disaster or accident scenes. These works have shown that the use of a camera and
GPS device is an efficient way of recording and visualizing SAR dog activities, and
that cloud services can help in sharing information with multiple people at different
locations.
4.2.1 Lightweight CRC Suit That SAR Dogs Can Wear for Several Hours
We have developed lightweight CRC suits for medium- and large-size SAR dogs. At domestic and foreign disaster sites, large dogs (approximately 30 kg in weight), such as Shepherds, Belgian Malinois, and Labradors, and medium-size dogs (approximately 15 kg in weight), such as the Brittany, are used as SAR dogs. Therefore, lightweight CRC suits that large- and medium-size SAR dogs can wear are necessary.
Table 4.1 shows the requirements of the CRC suit. These were decided based
on discussions with the JRDA. The weight of the CRC suit was limited to a maximum of 10% of the dog's body weight. Usually, 3–5% of the body weight is used when designing equipment for horse riding or bio-logging. However, we decided that 10% was not too heavy because SAR dogs are sufficiently trained, and they need
to work for only short durations (approximately two hours including the searching
time and waiting time). Camera specifications such as resolution and frame rate were determined based on the requirements for showing the actual image. The measurement range of the IMU sensors was decided by analyzing motion capture data of the walking, running, and jumping motions of a dog.
Fig. 4.4 CRC suits worn by SAR dogs (certified by JRDA) during the training
Figure 4.4 shows some of the CRC suits developed from 2011 onward. The figure
shows four JRDA-certified SAR dogs wearing these CRC suits. Table 4.2 shows the
performance of these CRC suits, which were developed to satisfy the requirements
shown in Table 4.1. CRC suit No. 4 is the first prototype suit, and it can be worn
by medium- and large-size SAR dogs. No. 5 is the second prototype CRC suit, and
it has a low-latency video delivery system and a GNSS receiver, which can output
RTK-quality positioning data offline. CRC suit No. 6 is the third prototype, and
it has a heart rate monitoring system that is used to estimate the emotional state of
a SAR dog. CRC suit No. 7 is a product-ready CRC suit developed in collaboration
with the Japanese company, FURUNO ELECTRIC CO., LTD.
The common features of CRC suits No. 4–No. 7 are described as follows:
Table 4.2 Specifications of CRC suits No. 4, No. 5, No. 6, and No. 7 for large- and medium-size SAR dogs
• Camera view angle: 160° (No. 4–6); 120° (No. 7)
• Camera, still images: 1920 × 1080 (all)
• Camera, video streams: 1280 × 720 at 5–10 fps and 640 × 360 at 5–10 fps (all); additionally 320 × 176 at 5–15 fps (No. 4)
• Audio: 48 kHz, monophonic (all)
• IMU (back): accel. ±196 m/s², ang. vel. ±450 deg/s (No. 4); accel. ±156 m/s², ang. vel. ±2 kdeg/s (No. 5, 6); accel. ±58.8 m/s², ang. vel. ±750 deg/s (No. 7)
• IMU (chest): none (No. 4–6); accel. ±156 m/s², ang. vel. ±2 kdeg/s (No. 7)
• GPS: NMEA at 5 Hz (No. 4); NMEA and GNSS raw at 5 Hz (No. 5–7)
• Wireless router: wireless LAN (No. 4); wireless LAN and 3G/4G (No. 5–7)
• Video streaming: Ustream (No. 4); WebRTC (No. 5–7)
• Electrocardiograph: 3-point MX lead (No. 6 only)
• Weight: 1.3 kg (No. 4); 1.3 kg (No. 5); 1.7 kg (No. 6); 1.5 kg (No. 7)
• Battery life: min. 2.0 h (all)
We explain the sensor layout and system configuration of the CRC suit using CRC
suit No. 7. Figure 4.5 shows the sensor layout and system configuration of CRC suit
No. 7. CRC suit No. 7 has a camera, a microphone, IMU sensors, a GNSS receiver, a
data recording device, batteries, and a wireless router. The recording device records
the journey of the SAR dog on a micro SD card.
A GNSS receiver and an IMU sensor were placed on the back of a canine to
measure the motion and position of the dog. The other IMU sensor was placed in
front of its chest to measure the motion of its front legs. A camera was fixed to the left upper leg to capture the dog's field of view together with its jaw. The orientation of the dog's face can be judged from the direction of the jaw in the image. The layout of
other devices was adjusted to maintain balance between the left and right sides of
the canine suit.
Fig. 4.5 Sensor layout (Upper) and system configuration (Lower) of the CRC suit No.7
A mobile phone network and cloud server are used to transmit a live video stream
from the camera and the position and behaviors of the SAR dog to its handlers and a
command center in real time. The live video stream, which contains camera images
and audio, and other sensor data are stored separately into different cloud servers.
Multiple tablet terminals and PCs held by the handlers and the command headquarters can acquire the data in real time from these cloud servers and check the status of the dog's journey.
We used a low-latency video delivery system developed by SoftBank Corp. for
the CRC suits. This video delivery system has been installed in CRC suit No. 5
and later versions. WebRTC technology and hardware video encoders on the board
PC are used to generate high quality and low-latency video streams. The use of
WebRTC realizes automatic adjustment of video quality and frame rate according to
the network speed. In addition, when the Internet cannot be used, the video delivery
system can deliver the video stream to one tablet terminal that is located within the
range of the wireless LAN of the CRC suit. This is an important feature for rescue
missions. The measurement of the delay time on the 4G mobile phone network
showed an average delay of approximately 0.8 s in conditions with approximately
30% of movement shown on the screen.
The handlers and command headquarters check the progress of the SAR dog mission
using tablet terminals or PCs, which are commercial products. We selected a browser
software program (Google Chrome) to display the activities of the SAR dog because
the browser can work on different operating systems (OS): iOS, Android, Windows,
etc. We developed a graphical user interface (GUI) for SAR dogs with CRC suits
using HTML and JavaScript. Figure 4.6 shows a prototype of the GUI. Figure 4.7
shows the current GUI, which was developed in collaboration with human interface
researchers and software corporations based on the GUI prototype.
We developed a GUI for practical usage, as shown in Fig. 4.7. The features of the
GUI are described as follows:
• The GUI can display the progress of three SAR dogs simultaneously.
• The video stream and map, which are mainly used by users, are displayed in a
large size on the GUI.
Fig. 4.6 Prototype of GUI for SAR dogs with CRC suits
Fig. 4.7 Practical GUI for SAR dogs with CRC suits
• In the video stream window, we can switch to the video stream of another dog by
clicking/tapping tabs named Dog 1, Dog 2, and Dog 3, which are located at the
top of the video stream window.
• In the map window, in addition to the existing Google Maps, we can display aerial photographs taken by UAVs and overlay the trajectories of the three SAR dogs on them. At a disaster site, it is difficult to grasp the situation if the trajectories of the SAR dogs are displayed on an old satellite photograph. By placing current aerial photographs on the map, the search results can be confirmed against imagery that reflects the current situation.
• The search area of a mission can be manually specified in the map windows.
• Optional functions are displayed as icons, and the functions can be switched by clicking the icons or via gestures (e.g., swiping). The video stream is captured
by clicking the corresponding icon. The results of a retroactive object search and
SAR dog behavior, which are explained in later sections, are displayed by clicking
their icons.
We conducted a field test of the CRC suit in collaboration with the JRDA. Four
JRDA-certified SAR dogs wore CRC suits during their training. Figure 4.8 shows
a portion of the explorations. Over 20 explorations were recorded and visualized in field tests spanning more than three years. The SAR dogs found several victims during these explorations. Figure 4.9 shows victims captured by the camera of the CRC suit.
These explorations included training with domestic firefighters in Japan and the
Japan international rescue team. Italian mountain rescue dogs wore the No. 4 CRC
suit in the Alps as a field trial. A 3G/4G mobile phone network was available in parts of the Alps. Figure 4.10 shows a mountain rescue dog that wore the CRC suit during exploration. One hidden person was found, and the person's location and visual information
were confirmed using the GUI. Through these field tests with users, we obtained
good suggestions for improving the CRC suit and its GUI. We incorporated many
of their comments (CRC suit design, drawing searching area on map, etc.) when
making the practical CRC suit No. 7 and the practical GUI shown in Fig. 4.7.
The practical CRC suit No. 7 and the practical GUI were evaluated in JRDA
training (Fig. 4.11) and ImPACT-TRC field testing (Fig. 4.12). SAR dogs certified
by the JRDA wore the No. 7 CRC suit and searched for victims in both fields. We
succeeded in recording and visualizing the exploration of the SAR dogs. After these
field tests, the No. 7 CRC suit and the practical GUI were lent to JRDA for additional
testing.
Fig. 4.10 Field test by an Italian mountain rescue dog in the Alps
Fig. 4.11 Field test of practical CRC suit No. 7 on simulated rubble in JRDA training field
Fig. 4.12 Field test of practical CRC suit No. 7 in ImPACT-TRC test field
Recent years have seen substantial improvements in image recognition. The main
drive behind this progress comes from advances in neural network technology. Convolutional neural networks (CNNs), first proposed in the 1980s and 1990s [17, 30], have become the technology of choice for many computer vision applications in recent years. However, training CNNs for object recognition generally requires large
amounts of labeled training data, and long training times, which is an obstacle for
our intended use scenario, as we cannot assume recognition targets to be known in
advance of operation.
Another strand of research in the field of neural networks is unsupervised learning
by means of autoencoder networks. Unsupervised learning has long been studied
as a means of finding latent structure in data, which can be applied for a variety
of purposes such as compression, visualization, and feature extraction (but, on its
own, not recognition). Convolutional variants of the autoencoder architecture [22]
are particularly apt at compressing and decompressing images, and in the process
extract features that can be used for a variety of goals. The present work combines
such feature extraction with a recognition algorithm.
Image recognition has a long history of research. However, the context of disaster
response poses a unique set of challenges that conventional methods are not well
equipped for. Below we list some of the major challenges, along with the measures
adopted here to address them.
(A) Image quality: Images captured by a SAR dog might be blurred due to sudden
movement or change in direction. In addition, due to the often limited bandwidth
of wireless networks, high image quality cannot be expected. We adopt a neural-
network-based feature extraction strategy to obtain useful features in the face of
blur and compression damage. We also adopt a recognition algorithm that takes
whole-frame context into account, which helps increase robustness against visual
discrepancies between the target as defined and as encountered.
(B) High diversity of recognition targets and operation environments: Search tar-
gets vary on a case-by-case basis, as do search environments. Therefore, it is
difficult to collect appropriate training data beforehand. Hence the system should
be capable of handling new environments and recognition targets with minimal
preparation. We combine unsupervised pre-training for general purpose feature
extraction with a recognition algorithm that requires no training. The system
further includes a mechanism for quick acquisition of recognition targets from
minimal instruction.
(C) Real time operation: To be of use during live operation, footage has to be
processed on the spot, in real time. However, the mobile nature of search operations
also requires system portability. Our feature extraction method makes it possible
4.3.2 Method
The architecture of the fCAE is given in Table 4.3. It consists of an input layer (IN),
three encoder layers (E1 to E3), a bottleneck layer (B), three decoder layers (D3 to
D1), and an output layer (OUT). Encoder and decoder layers with the same number
have the same spatial resolution. Weights are initialized randomly from a Gaussian
distribution with a mean of 0 and a standard deviation of 0.04. The hyperbolic tangent
activation function is used for all layers. The architecture is kept fairly small in
consideration of the need for real time performance on a laptop. We use striding to
incrementally reduce the resolution of the feature maps between the convolutional
layers. Striding reduces feature map resolution by applying the convolution kernels
at small intervals in each spatial dimension (in this study, the kernel interval is 2 in
most layers, see Table 4.3). We avoid the more commonly used pooling operation
(which reduces feature map resolution by taking the maximum or average over a
small window of values) because it discards spatial information that is relevant for
image reconstruction.
The net is trained in advance, in unsupervised fashion, using the mean squared
error (MSE) over the input and output image. To pre-train the fCAE for general use,
we select footage that covers the full color spectrum well, and has a good variety of
visual elements. As is often the case with CAEs, reconstruction is somewhat grainy
and shows some loss of fine detail. There is also a tendency to exaggerate edges.
Figure 4.14 shows a side-by-side comparison of input and output images.
Fig. 4.14 Example of a video frame (left) and its reconstruction by the fCAE (right)
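The architecture and training procedure described above can be sketched in PyTorch as follows. This is only an illustration: the channel counts, kernel sizes, and learning rate are placeholders rather than the values of Table 4.3, and PyTorch's default weight initialization is used instead of the Gaussian (std 0.04) initialization described in the text. Only the overall shape follows the chapter: tanh activations throughout, stride-2 convolutions instead of pooling, and an unsupervised MSE reconstruction loss.

```python
import torch
import torch.nn as nn

class FCAESketch(nn.Module):
    """Illustrative fully convolutional autoencoder (fCAE) sketch."""
    def __init__(self):
        super().__init__()
        act = nn.Tanh()
        # Three stride-2 encoder convs halve the resolution at each stage.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), act,   # E1
            nn.Conv2d(16, 32, 3, stride=2, padding=1), act,  # E2
            nn.Conv2d(32, 64, 3, stride=2, padding=1), act,  # E3
            nn.Conv2d(64, 32, 3, stride=1, padding=1), act,  # B (bottleneck)
        )
        # Mirrored transposed convs restore the input resolution.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 64, 4, stride=2, padding=1), act,  # D3
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), act,  # D2
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), act,  # D1
            nn.Conv2d(16, 3, 3, stride=1, padding=1), nn.Tanh(),      # OUT
        )

    def forward(self, x):
        z = self.encoder(x)          # bottleneck feature maps
        return self.decoder(z), z

# One unsupervised training step: MSE between input and reconstruction.
model = FCAESketch()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
frame = torch.rand(1, 3, 64, 64) * 2 - 1  # dummy frame scaled to [-1, 1]
recon, features = model(frame)
loss = nn.functional.mse_loss(recon, frame)
loss.backward()
opt.step()
```

Striding in the encoder preserves a spatial grid of bottleneck features, which is what the recognition stage below consumes.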
By passing the image through the fCAE, we obtain a grid of feature vectors
as explained above. We divide the image into small non-overlapping square regions
(patches below), each centered on a focal pixel and associated with the feature vector
of the focal pixel.
The user places the mouse cursor on an image region belonging to the target,
and then uses the mouse scroll functionality or keyboard arrow keys to create a
selection. Scrolling (pressing) up grows the selection area, and scrolling (pressing) down shrinks it. The unit of selection here is one patch. The
initial selection consists of only the patch at the cursor location (the root patch).
When growing (shrinking) a selection, patches are added (removed) in order of the
associated feature vector’s proximity (in feature space) to the feature vector associated with the root patch. This selection strategy makes it easy to quickly select regions
of similar color and texture, as selections typically grow to fill visually homogenous
areas before extending to dissimilar areas. The system allows making multiple selections by simply moving the cursor to a new location and growing a new selection
there. Selections can be discarded by right clicking in the root patch (or shrinking
the selection to size 0). When the user confirms the selection (by pressing the run
query or save query button and entering a label for the target), the system generates a
query (see next section) and runs or saves it. Aside from query recall at a later time,
saving queries is particularly useful for sharing targets between different instances
of the system.
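The growth order of a selection can be sketched as follows. The chapter does not specify the distance metric or grid layout, so Euclidean distance over a (rows, cols, dim) feature grid is our assumption; scrolling up by k then corresponds to selecting the first k+1 patches of this order.

```python
import numpy as np

def selection_order(feature_grid, root):
    """Rank patches by feature-space distance to the root patch.

    feature_grid: (rows, cols, dim) array of per-patch feature vectors.
    root: (row, col) of the patch under the cursor.
    Growing a selection adds patches in this order, so visually similar
    (nearby in feature space) patches are selected first.
    """
    rows, cols, _ = feature_grid.shape
    root_vec = feature_grid[root]
    flat = feature_grid.reshape(-1, feature_grid.shape[-1])
    dists = np.linalg.norm(flat - root_vec, axis=1)
    order = np.argsort(dists, kind="stable")
    return [(i // cols, i % cols) for i in order]

# The root patch itself (distance 0) always comes first.
grid = np.zeros((2, 2, 3))
grid[0, 0] = [1.0, 0.0, 0.0]   # root
grid[0, 1] = [0.9, 0.0, 0.0]   # similar -> selected early
grid[1, 1] = [0.0, 5.0, 0.0]   # dissimilar -> selected last
order = selection_order(grid, (0, 0))
```

Because similar patches sort ahead of dissimilar ones, a growing selection tends to fill a visually homogeneous area before spilling into its surroundings, matching the behavior described above.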
When the user defines a recognition target, this definition internally consists of a set
of feature vectors. To obtain a concise target representation, we perform clustering
on this set using the OPTICS hierarchical clustering algorithm [22]. We find the first
slice of this hierarchy that marks half or more of the feature vectors as belonging to a
cluster (the remaining vectors are considered noise). For each cluster c in this slice, we
compute its mean vector c_mean and a range vector c_range. The range vector consists of
the distances between the minimum and maximum values for each vector element,
divided by 2, over all vectors in the cluster. The set of cluster means and ranges
provides a concise target representation. It is stored along with the user-provided
label as a query.
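The query construction can be sketched as follows. For brevity we pass cluster labels in directly (label -1 marks noise) rather than computing the OPTICS hierarchy and extracting the slice described above, which is the part this sketch omits.

```python
import numpy as np

def build_query(vectors, labels):
    """Build a query as a list of (c_mean, c_range) cluster pairs.

    vectors: (n, dim) feature vectors selected by the user.
    labels:  per-vector cluster labels (-1 = noise). In the original
    system these come from a slice of the OPTICS hierarchy; here they
    are supplied directly. The range is half the per-dimension spread.
    """
    query = []
    for lab in sorted(set(labels) - {-1}):
        members = vectors[labels == lab]
        c_mean = members.mean(axis=0)
        c_range = (members.max(axis=0) - members.min(axis=0)) / 2.0
        query.append((c_mean, c_range))
    return query

vectors = np.array([[0.0, 0.0], [0.2, 0.0],    # cluster 0
                    [1.0, 1.0], [1.0, 1.2],    # cluster 1
                    [9.0, 9.0]])               # noise
labels = np.array([0, 0, 1, 1, -1])
query = build_query(vectors, labels)
```

Storing only means and ranges keeps the query compact regardless of how many patches the user selected.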
Next, we discuss how we determine the likelihood of a query target appearing in a given video frame. We compute match scores for the feature vectors from the frame
with respect to the query. The match score combines two elements: a local proximity
score measuring the distance of each feature vector to its nearest cluster (accounting
for cluster ranges), and a frame-level context score measuring the extent to which the
query’s set of clusters is represented in the frame. The local proximity score between
a cluster c of the query and a feature vector f extracted from the frame is computed
as follows:
dist_{c,f} = mean( max(0, | f − c_mean | − c_range) ).    (4.1)
The context score for a frame producing feature vector set F w.r.t. a query charac-
terized by cluster set C is computed as follows:
ctx_{C,F} = (1/|C|) Σ_{c∈C} min{ dist_{c,f} | f ∈ F }.    (4.2)
The context score is small when most or all clusters find a proximal feature vector
in F. The match score for a feature vector f ∈ F w.r.t. a query characterized by C is
then given by:
match_{f,C} = 1 − ctx_{C,F} − min{ dist_{c,f} | c ∈ C }.    (4.3)
The context score makes it possible for locations across the image to boost one
another’s match score. Match score computation is highly parallelizable, so this
process is run on the GPU. Given match scores, we can obtain binary recognition
results by simple thresholding. However, relative match scores themselves are of
more practical use for selecting relevant results.
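Equations (4.1)–(4.3) translate almost directly into code. The sketch below uses our own conventions (a query is a list of (c_mean, c_range) pairs, as a CPU loop rather than the GPU-parallel implementation described above):

```python
import numpy as np

def dist(c_mean, c_range, f):
    # Eq. (4.1): mean, over dimensions, of the distance of f from the
    # cluster once the cluster's per-dimension range is accounted for.
    return np.mean(np.maximum(0.0, np.abs(f - c_mean) - c_range))

def match_scores(query, frame_vectors):
    """Eqs. (4.2)-(4.3): context-aware match scores for one frame.

    query: list of (c_mean, c_range) cluster pairs.
    frame_vectors: (n, dim) feature vectors extracted from the frame.
    """
    # Eq. (4.2): context score; small when every query cluster has at
    # least one proximal feature vector somewhere in the frame.
    ctx = np.mean([min(dist(cm, cr, f) for f in frame_vectors)
                   for cm, cr in query])
    # Eq. (4.3): per-vector match score.
    return [1.0 - ctx - min(dist(cm, cr, f) for cm, cr in query)
            for f in frame_vectors]
```

A feature vector inside a cluster's range gets a local distance of 0, so its match score reduces to 1 minus the context penalty, which is how distant image locations boost one another.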
The system provides two recognition modes: forward recognition and backtrack
recognition. When a query is run, the user is presented with a prompt to select which
mode to use.
Forward recognition here refers to the common style of recognition, where recognition is performed chronologically over the video feed. In our system, this amounts to
applying feature extraction on the frame, followed by computation of the matching
scores with respect to each active query. Results are then optionally sent to the main
UI of the robot platform. On the laptop specified in Table 4.4, processing footage
at a resolution of 480 by 360 (CRC) or 480 by 300 (ASC), the system performed
recognition at a rate of approximately 12 fps.
We select which results to send to the main UI of the platform by finding local peaks
in the time-series of the per-frame maximum match score that exceed a minimum
threshold, are at least 3 s apart (measured by the time-stamps of the frames in the
case of backtrack recognition), and are separated from the preceding and subsequent
peaks by sufficiently deep score dips. This logic is designed to limit the number of
results sent and to avoid sending near-duplicate results (which naturally occur on
subsequent frames, in particular with slow-moving robots).
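The peak-selection logic above can be sketched as follows. Only the 3 s minimum separation is fixed by the text; the score threshold and dip depth here are illustrative values.

```python
def select_peaks(scores, timestamps, threshold=0.5, min_gap=3.0, min_dip=0.05):
    """Pick which per-frame maximum match scores to report.

    A frame index is reported when its score exceeds `threshold`, is a
    local peak, lies at least `min_gap` seconds after the previously
    reported peak, and is separated from it by a score dip of at least
    `min_dip` (threshold and dip values are illustrative).
    """
    peaks = []
    for i in range(1, len(scores) - 1):
        if scores[i] < threshold:
            continue
        if not (scores[i] >= scores[i - 1] and scores[i] > scores[i + 1]):
            continue  # not a local peak
        if peaks:
            last = peaks[-1]
            if timestamps[i] - timestamps[last] < min_gap:
                continue  # too close in time to the previous result
            if min(scores[last:i + 1]) > min(scores[last], scores[i]) - min_dip:
                continue  # no sufficiently deep dip between the peaks
        peaks.append(i)
    return peaks
```

The gap and dip conditions together suppress the near-duplicate results that otherwise occur on consecutive frames of a slow-moving platform.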
For the selected results, we determine bounding boxes as follows. We mark each
image patch whose associated match score exceeds 0.99 times the maximum match
score in the frame, and then find the bounding boxes that encompass each connected
region of marked patches. This threshold setting was found to be rather conservative,
generally producing bounding boxes smaller than the actual targets. However, while
well-fitted bounding boxes are visually pleasing, we found that they add little practical
value over a simple marker placed at the location of the highest match score.
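The bounding-box step can be sketched as follows; 4-connectivity is our assumption, as the chapter does not state which connectivity is used for the connected regions.

```python
import numpy as np

def result_boxes(match_map, rel_thresh=0.99):
    """Bounding boxes of connected regions of high-scoring patches.

    match_map: 2-D array of per-patch match scores. Patches scoring at
    least rel_thresh * max are marked, and each 4-connected region of
    marks yields one (min_row, min_col, max_row, max_col) box.
    """
    marked = match_map >= rel_thresh * match_map.max()
    seen = np.zeros_like(marked, dtype=bool)
    boxes = []
    rows, cols = marked.shape
    for r in range(rows):
        for c in range(cols):
            if not marked[r, c] or seen[r, c]:
                continue
            stack, region = [(r, c)], []
            seen[r, c] = True
            while stack:  # flood-fill one connected region of marks
                y, x = stack.pop()
                region.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < rows and 0 <= nx < cols \
                            and marked[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        stack.append((ny, nx))
            ys = [p[0] for p in region]
            xs = [p[1] for p in region]
            boxes.append((min(ys), min(xs), max(ys), max(xs)))
    return boxes
```

With the 0.99 relative threshold, only patches nearly tied with the frame maximum are marked, which is why the resulting boxes tend to under-cover the target, as noted above.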
Integration with the CRC consisted of implementing functionality for receiving video
and GNSS data from the suit and uploading result data for retrieval by the main
UI. Video was received by direct streaming over a WebRTC connection. The CRC
uploads its GNSS coordinates to an AWS database at a rate of 1 Hz. The recognition system retrieves the coordinates and obtains approximate coordinates for individual
video frames by linear interpolation. Results are again uploaded to AWS. Each result
image is accompanied by a metadata entry, specifying its time-stamp, coordinates,
maximum match score, bounding box coordinates, the label of the target, and an
identifier indicating the suit from which the video originated. The main UI of the
CRC retrieves the result data and marks the result locations on a map of the search
area. Clicking these marks brings up the relevant result. The UI also provides a list
view of the recognition results, ordered by match score. When a result is selected,
the corresponding location on the map is shown.
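The per-frame coordinate assignment can be illustrated as a simple linear interpolation between the 1 Hz GNSS fixes (function and variable names are hypothetical):

```python
import numpy as np

def frame_coords(frame_times, fix_times, lats, lons):
    """Linearly interpolate latitude/longitude at each frame time
    from timestamped GNSS fixes (times must be increasing)."""
    lat = np.interp(frame_times, fix_times, lats)
    lon = np.interp(frame_times, fix_times, lons)
    return lat, lon
```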
We also integrated the system with the Active Scope Camera (ASC) platform.
As this is a wired system, communication is simpler. All video and result transfer is
performed via ROS (Robot Operating System). Also, instead of GNSS coordinates,
direct measurements of the robot positions are recorded, simplifying localization of
the results in space. The result display is similar, except here results are localized in
3D space.
We tested the system on location at the Fukushima Robot Test Field in Japan. The
field test with the CRC focused on backtrack recognition. We placed three targets
(a hat, gloves, shoes) along the course set out for the CRC to run. After the CRC
completed the relevant sections of the course, we ran queries for the three targets in
backtrack recognition mode. For each query, we set the system to upload the three
results with the highest match scores, to allow for a second and third guess in case
of misrecognitions. For all three queries, the #1 result correctly identified the target
(Fig. 4.16). However, in each case the bounding boxes covered only part of the target.
As results for a given target are a minimum of 3 s apart and the targets were only
visible for less time than that, the remaining results were necessarily spurious. The
conditions in this experiment were fairly forgiving: the targets were clearly visible
against a homogeneous background. However, the video stream suffered substantial
compression artifacts that were absent from the images used to define the queries,
leading to a visual discrepancy between the target definition and its appearance during
the field test.
At the same occasion, we also performed a field test with the ASC platform, using
forward recognition. Here we used a single, but more complex target. The target was
the uniform of a worker, and consisted of a jumpsuit, gloves, and kneepads (Fig.
4.17, left panel). The outfit was placed in a structure mimicking a collapsed building,
which the ASC explored. The lighting conditions in this experiment turned out to
be very challenging. At the time of the experiment, bright sunlight fell into part
of the structure. This produced a combination of overexposed and underexposed
areas in the video feed. Whereas each individual element of the outfit presented a
notably different appearance under these conditions than in the image used for target
definition, their combination still elicited the highest match scores over the course
of the experiment. Thus, the top-ranked result at the end of the experiment correctly
identified the target (Fig. 4.17, right panel). Here too, the bounding box covered only
part of the target. This result illustrates the benefits of including the context score in
the recognition algorithm.
Fig. 4.16 Successful backtrack recognition of three targets (clockwise from top-left; gloves, hat,
shoes) in live footage from the CRC. The CRC is visible on the right side of each image
Fig. 4.17 Left: Target definition of a worker uniform (jumpsuit, gloves, and kneepads). Right:
successful detection of the uniform in live footage from the ASC in a simulated disaster setting
under challenging light conditions
4.3.4 Discussion
In this section, we presented an image recognition system for use in search activities,
providing basic recognition abilities with high flexibility and minimal preparation
time. We discussed its integration with the CRC and ASC platforms, as well as field
tests performed with the integrated systems. Whereas the experiments reported here
are small in scale, we expect the system to be most useful in situations where multiple
search teams are active simultaneously, and new search targets become known over
the course of the mission. In such situations, it can become challenging to allocate
sufficient human attention to scan all footage for the relevant targets. In particular,
retracing where a new target may have been visible in past footage is costly in terms
of time and attention. The backtrack recognition functionality of the system could
prove particularly valuable here.
Directions for future work are as follows. We aim to further refine the recognition
algorithm. Whereas the use of local and global scores has proven effective, the way
that they are currently combined is decidedly ad hoc. With regard to result display,
recognition results plotted on the map currently indicate locations from which the
target is visible rather than the actual object location. Integration of orientation data
and object distance estimates could be used to mark approximate object locations
instead, which would be more intuitive. We further aim to improve UI integration
by setting the system UI up as a module that can be launched from the main UI of
each platform. It would also be useful to allow the UI and recognition system to
run on separate computers, so that the system could be operated from a tablet or
lightweight laptop. Lastly, functionality for quickly sharing queries across instances
of the system would help facilitate multi-robot searches.
4.4.1 Introduction
One of the most influential elements of SAR dog performance is the emotional state
of the dog. If the dogs become fatigued, the handlers need to let them rest and use
other dogs [39]. If a SAR dog cannot maintain the motivation to search, that dog
should be replaced with others due to time constraints. Whether the SAR dog should
be replaced or not is determined based on the experience of the handler visually
observing the behavior of the SAR dog. However, the searching task is not always
conducted within the handler's field of view, and the dogs occasionally work
in hidden areas. Therefore, to help the handlers make appropriate decisions,
real-time and remote visualization of the internal emotional state of the SAR dogs is
necessary.
The real-time emotion estimation system for SAR dogs needs to be able to: (1)
objectively monitor emotion, (2) estimate and visualize via a GUI in real time, and
(3) simultaneously visualize the location of the SAR dogs. Items (2) and (3) are described
in other parts of this chapter; we focus on the real-time emotion estimation system
in this section.
The remainder of this section is organized as follows. Section 4.4.2: Emotion
and heart rate variability. Many reports describe the relationship between
emotion and heart rate variability (HRV), which is calculated from the variability
of beat-to-beat intervals and reflects the balance and strength of sympathetic and
parasympathetic nerve activities. We review these studies and present some evidence
in dogs. Section 4.4.3: System components, including electrode fixing, electro-
cardiogram (ECG) data collection (the ECG is a measurement of the electrical activity
of the heart muscle), and a communication device. Section 4.4.4: Emo-
tion estimation based on HRV. The emotion estimation system uses the variation of heart
beat intervals; by calculating the time-domain indices of HRV, emotional states (pos-
itive/negative) are estimated by a machine learning method. Section 4.4.5: Develop-
ment of a real-time emotion estimation system. Use of a real-time classifier enables
us to estimate the emotion of a SAR dog in real time. A real-time emotion esti-
mation system is developed and evaluated using house dogs and SAR dogs in an
outdoor environment.
influenced by the emotional state [29]. In animals, there are many studies on the
association of HRV with negative emotion from the viewpoint of animal welfare [37].
Recently, some studies indicate that positive emotional states are also
associated with HRV [3, 37]. In dogs, we observed that the influence of emotional
change can be detected in HR and HRV [24]; namely, negative and positive emotions
are associated with a decline of RMSSD and a decline of SDNN, respectively. These data
indicate that RRI and HRV parameters are useful for understanding the balance of the
sympathetic and parasympathetic nerve systems.
Interestingly, in laboratory rodents, HRV reflects reward anticipation,
indicating that not only positive emotion (relaxation or comfort) but also motivation
(expecting a reward) can be detected using HRV parameters [21]. This suggests
that HRV is useful for detecting the anticipation of a SAR dog when searching for and
finding the target odor, as well as increased motivation.
Parasympathetic nerve activity changes rapidly, on a timescale of less than 1 s, whereas
sympathetic activity changes more slowly, over more than 5 s [1]. Because these divi-
sions can produce contradictory actions, such as speeding and slowing the heart, their
effect on an organ depends on their current balance of activity. For evaluating
the emotional state of a SAR dog, time-domain analysis of HRV is useful because
this method can distinguish the emotional state of a dog using 15 s of heart rate data
[24]. We use time-domain HRV analysis to estimate the emotional state of SAR
dogs, which changes ceaselessly from moment to moment.
Fig. 4.18 a The “cyber-enhanced rescue canine suit” for SAR dogs. The suit is equipped with
a small ECG device, accelerometer, GNSS, WiFi router, and battery. The weight of the cyber-
enhanced rescue canine suit is less than 1.2 kg, which is light enough for medium- and
large-sized dogs to wear. b The electrode placement with the M-X lead (M: manubrium sterni at the
level of the 2nd thoracic vertebra; X: xiphoid process), chosen to minimize the effect of the dog's movement. c
Schematic drawing of the electrode fixing. The electrode is covered with a sponge to increase the
pressure on it; the sponge is in turn pressed by the fixing belt placed around the body of the dog
The inertial sensor, an Xsens MTi-G-710, measures acceleration data and the latitude
and longitude of the position of the dog. The latitude and longitude are then uploaded
to the data server. The ECG device, a single-board computer (Raspberry Pi 2 Model
B), IMU, and GNSS antenna are aggregated and mounted on the developed ECG suit. The
power for the devices is supplied by batteries carried on the suit. The weight of the
ECG suit is less than 1.7 kg, light enough for a medium- or large-sized dog to
wear.
(b) Data processing
Each time point of an R-wave peak is detected and sent to the single-board computer,
where the RRI values are calculated. The RRI values are regularly sent over a mobile phone
network and stored in DynamoDB, a NoSQL data server provided by AWS. On the
calculation server, the HR data are processed to calculate the time-domain indices of
HRV and to estimate the emotional state of the dog [19]. The data are calculated, and the
resulting RRI and estimated emotional state can be visualized in the GUI from anywhere
(Fig. 4.19).
Fig. 4.19 Overview of the current system for real-time emotion estimation. The cyber-enhanced rescue
canine suit is equipped with accelerometer and ECG devices, and the data are transmitted to a computer
(Raspberry Pi 2). The RRI values are calculated on the computer and sent to the data server via WiFi.
The estimation calculation is conducted on the calculation server, and the results (HR, emotion estimation,
and reliability) are visualized in the GUI
Fig. 4.20 Example of offline data collection and analysis of emotion estimation. a The picture
shows the data collection. b The GNSS data are plotted onto Google Maps. c The results of offline
estimation are visualized in the corner (red line: HR; blue line: estimated emotional state (top:
positive; middle: neutral; bottom: negative))
the commands, lost the motivation, and paid attention to the other stimuli rather than
the task. Thus, these two conditions can be considered similar to the searching task.
We investigated the relationship between the emotional categories and HRV using
time-domain analysis. We have demonstrated that emotional states can be analyzed
by time-domain analysis with a time window of 15 s [24]. Five indices of
HRV, namely, mean RRI, SDNN, total power, RMSSD, and NN50 (Table 4.5), are
adopted for the estimation.
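The five indices can be computed from a window of RR intervals as sketched below; treating total power as the RRI variance is an assumption on our part, and the function name is hypothetical:

```python
import numpy as np

def hrv_indices(rri_ms):
    """Time-domain HRV indices over one analysis window of RR
    intervals given in milliseconds."""
    rri = np.asarray(rri_ms, dtype=float)
    diff = np.diff(rri)  # successive RRI differences
    return {
        "mean_rri": rri.mean(),
        "sdnn": rri.std(ddof=1),                 # SD of all intervals
        "total_power": rri.var(ddof=1),          # variance of intervals
        "rmssd": np.sqrt(np.mean(diff ** 2)),    # RMS of successive diffs
        "nn50": int(np.sum(np.abs(diff) > 50)),  # successive diffs > 50 ms
    }
```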
A classifier of emotional state is constructed using a supervised machine learning
method [19]. A random forest, one of the ensemble learning methods, is adopted
for this classification. Random forest learning is achieved by bootstrapping a
training dataset. The random forest classifier is trained using the five indices
of HRV measured under conditions of positive or negative emotional states.
For hyperparameter tuning, 80% of the data are used for parameter learning and
the rest are used for validation of classification accuracy. The hyperparameter and
corresponding classifier that give the best performance on the validation data are
used as the classifier for estimating the emotional state of the dog [19].
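A minimal sketch of this training procedure with scikit-learn, using synthetic feature data and an illustrative hyperparameter grid (the actual features and the grid searched are not specified here):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))             # five HRV indices per window (synthetic)
y = (X[:, 0] + X[:, 3] > 0).astype(int)   # 1 = positive, 0 = negative (synthetic)

# 80% for parameter learning, 20% for validation, as in the text
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.2, random_state=0)

best = None
for n_estimators in (50, 100, 200):       # example hyperparameter grid
    clf = RandomForestClassifier(n_estimators=n_estimators, random_state=0)
    clf.fit(X_tr, y_tr)
    acc = clf.score(X_va, y_va)
    if best is None or acc > best[0]:
        best = (acc, clf)                 # keep the best-validated classifier
print(f"validation accuracy: {best[0]:.2f}")
```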
In addition, the abovementioned fast-running data indicated
that RRI values during this type of movement can contain errors (Fig. 4.21). There-
fore, we detect fast running by applying machine learning to the acceleration data,
and we exclude the corresponding RRI data from the estimation.
The classification accuracy for emotional states under the stopping, walking, and
slow-running conditions was 74% on the validation data. Notably, adding acceleration
data improved the accuracy (Table 4.6). This result shows that the positive or negative
emotional state of the subject can be classified offline by using time-domain indices
of HRV (Fig. 4.20).
Fig. 4.21 ECG data obtained by the cyber-enhanced rescue canine suit. While the dog is resting
or walking, clear R peaks are easily detected; however, when the dog is running, the R peaks and
EMG signals are merged, and it is difficult to detect the ECG R peaks
Our proposed real-time emotion estimation system consists of the following elements: (i)
measurement of heart beat intervals, (ii) calculation of indices of HRV, (iii) emotion
estimation using a machine learning method, and (iv) visualization of heart rates and
emotional states in the GUI [19]. As a continuation of this research, the online-measured
heart rates were evaluated for consistency with an offline-measured reference
using Bland-Altman analysis.
In the behavioral experiment, the subject is the same as in the offline estimation
experiment, and a classifier of the emotional state is trained in the same experiment.
No specific task is imposed, and the dog stands, or follows the handler with (positive
condition) or without (negative condition) receiving a treat from the handler. RRIs
are measured and uploaded to a data server by using a developed ECG suit.
A calculation server receives the measured RRIs from the data server, and the
time-domain indices of HRV are calculated in real time as the server
continuously obtains the most recent RRIs. The length of the
time window to be analyzed is approximately 5 s in this study: the calculation server
uses the most recent eight RRIs to calculate the indices, because the RRI values
of a canine at rest are usually in the 0.6–1.0 s range. The random forest
classifier also outputs the probabilities of the positive and negative classes. For the final
decision on the estimated emotional state, if the larger of the two output probabilities is
less than 0.7, the emotional state is considered neutral.
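This final decision rule can be sketched as follows (the helper name is hypothetical; the 0.7 threshold is from the text, and the two class probabilities are assumed to sum to one):

```python
def decide_emotion(p_positive, threshold=0.7):
    """Map the classifier's positive-class probability to a
    positive/neutral/negative decision."""
    p_negative = 1.0 - p_positive
    if p_positive >= threshold:
        return "positive"
    if p_negative >= threshold:
        return "negative"
    return "neutral"  # neither class is confident enough
```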
Heart rates (the inverse of RRIs) and emotional states are visualized in real time in
the GUI. A graph of the heart rate of the dog is drawn using the Chart.js
JavaScript library and is updated every second. A graph of the estimated emotional
state is drawn using Python's Matplotlib library and is updated every 5 s.
The heart rate of the dog can be monitored in a web browser, and its position can
be drawn on Google Maps.
The proposed system worked in an outdoor environment, measuring RRIs online
and classifying the emotional state of the canine in real time based on time-domain
indices of HRV (Fig. 4.22). The canine position on Google Maps, the heart rates, and
the estimated emotional state were visualized. This result confirms that the proposed
system can monitor the position, heart rates, and estimated positive or negative state
in real time.
Moreover, we tested one SAR dog with this system. The learning data were obtained
in training sessions in which the SAR dog searched a wide practice area with
(positive) or without (negative) putative victims hidden in a hole or broken house. Then,
the SAR dog was tested again in a different practice area. When the SAR dog
approached the area in which a victim was hidden, the emotion estimation became
“positive”. These results suggest that this real-time emotion estimation system can
be useful in an actual disaster scenario.
Fig. 4.22 Example of data collection and analysis of emotion estimation in real time using a
SAR dog. a The picture shows the data collection. b The GNSS data are plotted onto Google Maps.
c The results of online estimation are visualized in the corner (red line: HR; blue line: estimated
emotional state (top: positive; middle: neutral; bottom: negative))
4.4.6 Discussion
SAR dogs play an important role in searching for victims in disaster sites. The
efficiency of the search depends on the performance of the dogs, and their per-
formance is related to their internal states, such as motivation and fatigue. However,
these internal states cannot be observed when the dog is in a distal position. We
developed a real-time emotion estimation system for SAR dogs based on electrocar-
diography signals. The heart beat intervals of the dogs are monitored online
by the “cyber-enhanced rescue canine (CRC) suit,” which is equipped with an electrocardio-
graphy device and a 3-axis accelerometer, and the data are transmitted via WiFi [19]. Time-domain
indices of heart rate variability are calculated and used together with 3-axis accel-
eration data as inputs to an emotional state classifier. Emotional states are classified
into three categories, i.e., “positive,” “neutral,” or “negative.” The measured intervals
and estimated emotional state are visualized in the GUI in real time [19]. This system
was evaluated during training sessions of the SAR dogs, and it operated in
real time. This new system can enhance the efficiency of the search missions of SAR
dogs.
Our newly developed system worked well within the same dog. In other words, if parameter
learning is conducted using the HRV data from one specific dog, the accuracy of
emotion estimation for that same dog can be as high as 90%. In the future, two
concerns need to be resolved.
(1) Generalization of the calculator across SAR dogs. Whether an estimation calculator
obtained from one specific SAR dog can be generalized to other SAR dogs remains to be
verified. In this regard, the developed calculator should be applied to other SAR dogs under two
conditions, positive (the target odor/human present) and negative (the target
odor/human absent), and the accuracy should be calculated. If the accuracy is not
high enough, the parameter learning data should be obtained from multiple SAR dogs,
and the estimation calculator adapted to those dogs. If the resulting
accuracy is high, one standard calculator can be used in the future; if not, this system
requires parameter learning data from each individual SAR dog.
(2) Generalization of the calculator to other service dogs. A real-time
monitoring system would also be useful for increasing the efficiency of other service dogs,
for example, chemical detection dogs such as drug-detecting dogs or quarantine detector
dogs. These dogs are trained to search for a target odor, similar to SAR dogs, so
our system could be similarly useful for them. Guide dogs for blind people and hearing
dogs are trained differently, so the parameter learning would need to be redone,
but the core of this system could still be useful. In either case, “the status of
being perplexed” is one of the most important emotions in service dogs. In animal
studies on rats, metacognition can be detected in behavioral experiments;
therefore, it is possible that animals have a specific emotional status related to
“being perplexed”. Estimating “being perplexed” is a direction for future
analysis.
A system that records the actions and condition of SAR dogs and visualizes them
for the remote handler is necessary to enhance the search activities of SAR dogs.
Therefore, we developed a system that estimates the actions of a dog and presents
the result to the remote handler in real time. Our system accurately estimates the
actions of a dog in real time using a machine learning method, given the limited
computational resources available on the CRC suit.
With the current spread of inertial sensors, which are very small, lightweight, cheap,
and energy efficient, bio-logging applications have become popular over the last
decade. There has been some research on analyzing the behavior of dogs based on
the inertial data collected using such sensors. Gerencsér et al. [18] proposed a system
to classify seven activities of a dog (stand, sit, lay, walk, trot, canter, and gallop)
using inertial sensors and a support vector machine. Additionally, den Uijl et al. [8]
validated that the detection of eight behavioral states (walk, trot, canter/gallop, sleep,
static/inactive, eat, drink, and headshake) from accelerometer data is useful from a
veterinary perspective for early detection of diseases. Ladha et al. [28] presented an
algorithm that counts the number of steps and estimates the distance traveled by a
dog from accelerometer data. However, to the best of our knowledge, no systematic
research exists addressing the classification of the activities of SAR dogs.
Figure 4.23 shows the configuration of our system for estimating the actions of a
dog. We use a CRC suit [25, 35, 44] to obtain and process data on the activities
of the dog. The CRC suit includes sensor devices, a Raspberry Pi processor, and a
mobile WiFi router. The sensor devices include a camera, a microphone, IMUs, and
a GNSS receiver.
First, the system records the activities of the SAR dog using the sensor devices
and transmits the data to the Raspberry Pi processor. From among the data obtained from
these sensors, we use the acceleration (X, Y, and Z axes), the angular velocity (X, Y,
and Z axes), and the posture (yaw, pitch, and roll axes) to estimate the actions.
The action estimation for the dog is performed in real time using the machine
learning system installed on the Raspberry Pi processor. Next, the obtained estimation
result is transmitted to the AWS cloud database via the 3G/4G mobile phone network
using a mobile WiFi router.
Fig. 4.23 Overview of the system for estimating and visualizing a dog’s action
Finally, the visualization system obtains the action estimation result from
AWS and presents it to the handler in real time. Because the estimation result is
transmitted to the visualization system after being routed through AWS, the real-
time performance of the visualization is somewhat impaired by transmission and processing
delays. However, this routing ensures reliable storage of the estimation results
and facilitates their use in other systems.
The system estimates the actions using the acceleration and angular velocity information
measured by the IMU. The actions are labeled as bark, run, sniff_object, sniff_air,
stop, and walk, where multiple class labels can apply at the same time. Therefore,
we consider this task as a multi-label classification problem. Because it is difficult to
directly handle this type of task, we decomposed it into a number of binary classifi-
cation problems using a technique called binary relevance [41]. In binary relevance,
for each class label, a binary classifier that estimates whether the data belongs to the
class is built. Then, labels whose classifiers output a positive estimation are used as
the overall output.
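A minimal binary-relevance sketch for the six labels, using a placeholder binary classifier (the actual system uses gradient tree boosting, as the text explains):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

LABELS = ["bark", "run", "sniff_object", "sniff_air", "stop", "walk"]

def train_binary_relevance(X, Y):
    """Train one binary classifier per label.
    Y is an (n_samples, n_labels) 0/1 matrix in LABELS order."""
    return {lab: DecisionTreeClassifier(random_state=0).fit(X, Y[:, j])
            for j, lab in enumerate(LABELS)}

def predict_labels(models, x):
    """Return every label whose classifier predicts positive."""
    return [lab for lab in LABELS if models[lab].predict([x])[0] == 1]
```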
The data obtained by the IMU are linearly interpolated at 200 Hz and converted to
an amplitude spectrum using the short time Fourier transform (STFT). Subsequently,
it is used to estimate the actions at the center of the STFT window. In the STFT, we
use a Hamming window with a size of 0.64 s and a shift of 0.32 s. Considering the
trade-off between real-time performance of estimation and computation resources,
preprocessing and estimation are executed for a batch every 0.96 s. The size of the
batch is usually three. The size can be increased to eliminate delay in estimation if
any occurs. The output is the probability for each of the actions. This output is then
sent to the cloud database.
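This preprocessing can be sketched as follows for a single IMU channel; the 128-sample window and 64-sample shift correspond to 0.64 s and 0.32 s at 200 Hz, while the function name is hypothetical:

```python
import numpy as np

FS = 200   # resampling rate in Hz
WIN = 128  # 0.64 s window
HOP = 64   # 0.32 s shift

def amplitude_spectrogram(t_raw, x_raw):
    """Resample one IMU channel to 200 Hz by linear interpolation,
    then compute the amplitude spectrum per Hamming-windowed frame."""
    t_uniform = np.arange(t_raw[0], t_raw[-1], 1.0 / FS)
    x = np.interp(t_uniform, t_raw, x_raw)       # linear resampling
    win = np.hamming(WIN)
    frames = []
    for start in range(0, len(x) - WIN + 1, HOP):
        seg = x[start:start + WIN] * win
        frames.append(np.abs(np.fft.rfft(seg)))  # amplitude spectrum
    return np.array(frames)                      # (n_frames, WIN // 2 + 1)
```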
There are many methods available to address the binary classification problem,
such as support vector machines [20], neural networks [9], random forests [4], and
gradient tree boosting [16]. In the proposed system, the binary classifier must perform
its task as fast as possible on the Raspberry Pi processor to achieve good real-time
performance. Additionally, its estimation accuracy needs to be high in order to be
useful for the dog handlers. Therefore, we use gradient tree boosting and its fast
implementation scheme known as XGBoost [6].
Gradient tree boosting is a state-of-the-art method for standard classification and
some other tasks. It is a decision-tree ensemble method. The ensemble model is an
additive function of regression trees (also known as decision trees). The trees are
sequentially built and added to the ensemble. Each tree is optimized to improve the
ensemble using first and second order gradient statistics on the objective function.
An intuitive description is provided in Fig. 4.24.
Gradient tree boosting has various hyperparameters. In particular, the learning
rate and the number of trees have a significant influence on its behavior.
Fig. 4.24 A tree ensemble model and its prediction for given examples
The learning rate controls the impact of each individual tree. In practice, a smaller learning rate
tends to yield better performance. However, a smaller learning rate means that more
trees are needed to converge; as a result, more computational resources and execution time
are required. Therefore, we set the learning rate to 0.1, which is slightly higher than
what is used in data science competitions, where accuracy is the most important
factor.1 We also need to tune the number of trees because too many trees can cause
overfitting. Although we fixed the other hyperparameters, the optimal number of
trees varies depending on the task. Therefore, we determined the number of trees for
each label using cross validation. The other hyperparameters of XGBoost are listed
in Table 4.7, which are applied to all labels. The parameter objective determines
the objective function. We use cross entropy, which is typically used in standard
binary classification. The parameters subsample and colsample_bytree determine the
subsample ratio of the training instance and columns, respectively, when constructing
1 For example, 0.01 is used in the winning solution of the Porto Seguro’s Safe Driver Prediction
competition hosted by Kaggle (https://www.kaggle.com).
each tree. The parameter max_depth determines the maximum depth of each tree.
The parameter max_delta_step is a value related to regularization of the weight of
each tree. If it is set to a positive value, it can help in handling unbalanced data.
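For illustration, the parameters named above might be collected into an XGBoost-style configuration dictionary; apart from the 0.1 learning rate, the values below are placeholders, since the actual settings are given in Table 4.7:

```python
# Illustrative XGBoost parameter set mirroring the parameters
# discussed above; values other than learning_rate are placeholders.
params = {
    "objective": "binary:logistic",  # cross-entropy loss for binary labels
    "learning_rate": 0.1,            # eta; favors speed over marginal accuracy
    "subsample": 0.8,                # row subsampling ratio per tree
    "colsample_bytree": 0.8,         # column subsampling ratio per tree
    "max_depth": 6,                  # depth cap for each tree
    "max_delta_step": 1,             # regularization; helps unbalanced data
}
# The number of trees is chosen per label by cross-validation,
# as described in the text.
```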
To economize on computational resources, we use only the low-frequency part of the
amplitude spectrum. We confirmed in a preliminary experiment that this does not
affect the estimation accuracy.
Fig. 4.25 Visualizer for the remote handler and pin icons of the corresponding actions
The user can watch multiple dogs on this visualizer. The tabs on top of the videos
can be used to switch between videos. The trajectories of dogs can be turned on/off;
hence, the handler can see multiple trajectories at the same time.
2 http://golden-layout.com/.
3 https://www.highcharts.com/.
exact probabilities as a time series for each label. During field tests, this developer
visualizer enables us to check the behavior of the system and to immediately
detect bugs in the field.
4.5.3 Experiments
In this section, we describe how the annotated data for training the machine learning
model were obtained. Annotation was performed manually using videos that we had
taken earlier. We used ELAN [11, 43], a software tool, to add annotations to the
video and audio files.
We associate the six labels with periods of time as specified below.
• Bark: The dog discovered a victim and is barking. We ignored simple growls that
were unrelated to the discovery.
• Walk: The dog is walking forward. We excluded jumping, running, or moving
backward. However, a slight backward motion to change direction is included.
• Run: The dog is running (cantering and galloping). The difference between walk-
ing and running is that while running, all four legs are off the ground at some point
while the dog moves forward.
• Stop: The dog remains at a given location for at least 0.1 s. Thus, even if the dog
makes slight movements, as long as its position does not change significantly, we
regard it as a stop.
• Sniff_air: The dog is sniffing a scent/odor.
• Sniff_object: The dog is sniffing the scent/odor of an object while bringing its
nose to the source.
Among these labels, stop, walk, and run are mutually exclusive. This means that
these events cannot occur simultaneously. Similarly, bark, sniff_air, and sniff_object
are also mutually exclusive. However, other combinations are possible. For example,
walk and sniff_object are sometimes annotated for the same given time.
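The mutual-exclusivity constraints can be captured by a small validator (a hypothetical helper for checking annotations, not part of the described system; the group memberships are from the text):

```python
# The two groups of mutually exclusive labels described above.
EXCLUSIVE_GROUPS = [
    {"stop", "walk", "run"},
    {"bark", "sniff_air", "sniff_object"},
]

def valid_label_set(labels):
    """A label set is valid if it contains at most one label
    from each mutually exclusive group."""
    labels = set(labels)
    return all(len(labels & group) <= 1 for group in EXCLUSIVE_GROUPS)
```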
If the same action continues for a period of time, we regard it as a single action
without breaking it into constituent actions. Moreover, we insert a short blank
between two mutually exclusive actions, such as walk and run, to keep the inertial
sensor data associated with the periods of these actions independent.
The time-stamps of the annotated labels are synchronized with those of the sensor
data recorded by the CRC suit. This synchronization must be performed carefully
because an action and the corresponding sensor data can become mismatched by
even a small 0.1 s time shift. This is especially true for bark, sniff_object, and
sniff_air. We simplified this task by displaying the Unix time-stamps of the Raspberry
Pi processor in the CRC suit and embedding them in the video at the start of each
trial. We then manually synchronized them based on these time-stamps.
4.5.3.2 Evaluations
In this section, we present our evaluation of the accuracy of the proposed system
on the dataset we accumulated. We used one of the 13 data samples as testing data
and the rest as training data.
A receiver operating characteristic (ROC) curve is a popular measure for evalu-
ating classifier performance. We show the ROC curves together with the area under
the curves (AUC) of the classifiers in our proposed system in Fig. 4.27a. Note that
the AUC of an optimal classifier is 1.0; the higher the AUC, the better. These results
show that the proposed system successfully estimated the run, walk, stop, and bark
actions with high accuracy. However, the accuracies for the actions sniff_object and
sniff_air were low. There are two possible reasons for this.
First, the dataset that we used does not contain enough data on sniff_object and
sniff_air actions. The number of each class contained in the dataset is shown in
Table 4.8. Walk had the largest number of labels at 3226, whereas there
were only 1328 sniff_object labels and 336 sniff_air labels. The accuracy can be
expected to improve as more data on these actions are collected.
Fig. 4.27 ROC curves for (a) off-line test, and (b) field test
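The one-vs-rest ROC/AUC evaluation used above can be sketched as follows for a single action; this is a minimal hand-rolled AUC via the Mann-Whitney statistic, not the authors' evaluation code.

```python
import numpy as np

def roc_auc(y_true, y_score):
    """AUC computed as the Mann-Whitney U statistic: the probability that
    a randomly chosen positive sample scores higher than a randomly
    chosen negative one (ties count half)."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# One-vs-rest evaluation for a single action (e.g. "bark"): y_true marks
# windows where the action was annotated, y_score is the classifier's
# probability for that action.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, y_score))  # 0.75
```

An AUC of 1.0 corresponds to an optimal classifier, and 0.5 to random guessing, which is why the low sniff_object and sniff_air curves indicate poor separability.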
In this section, we will report on the performance evaluation of our system, which
was conducted in an ImPACT-TRC public demonstration at the Fukushima Robot
Test Field in June 2018. We performed a field test with the cooperation of a SAR dog
and its handlers. In the test, the task was to find a victim and lost items in a simulated
disaster site. The system was tested on a Raspberry Pi 2 processor in the CRC suit
alongside other processes, such as video-broadcasting and data-logging. Under the
simulated conditions, we verified that the system functioned as expected online,
without any dropped estimation results. Although several short delays due to network
congestion occurred while registering the estimated results to the database, there was
no delay in the estimation itself. This is because the registration and estimation ran
on separate threads. After the demonstration, we evaluated the accuracy
of the classification for the data recorded in the rehearsals before the demonstration.
The ROC curve and its AUC are shown in Fig. 4.27b. Because no run or sniff_air
actions appeared during these trials, these are omitted in the figure. This shows that
the system achieved nearly the same accuracy as the offline test.
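The separate-thread design mentioned above, in which estimation never waits on the database, can be sketched as a producer-consumer pair; the thread structure, timings, and record format below are illustrative assumptions, not the suit's actual software.

```python
import queue
import threading
import time

results = queue.Queue()

def estimate_actions(n_windows=5):
    """Estimation thread: classifies sensor windows and never blocks on
    the network; results are handed off through a queue."""
    for t in range(n_windows):
        time.sleep(0.005)                    # stand-in for classifying one window
        results.put({"window": t, "action": "walk"})
    results.put(None)                        # sentinel: no more results

def register_results(log):
    """Registration thread: absorbs database/network latency so that
    estimation is never delayed by congestion."""
    while True:
        item = results.get()
        if item is None:
            break
        time.sleep(0.02)                     # simulated network congestion
        log.append(item)

log = []
threads = [threading.Thread(target=estimate_actions),
           threading.Thread(target=register_results, args=(log,))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(log), "results registered")
```

Even though registration is four times slower than estimation here, the estimation loop runs at full speed and no results are lost; the queue simply grows during congestion.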
We also tested the two visualizers described in Sects. 4.5.2.3 and 4.5.2.4. The
snapshot in Fig. 4.25 was taken at the rehearsal. The trajectories are colored in
yellow or red depending on the estimated probabilities of walk and run actions. The
red pin indicates the position where the dog barked. Figure 4.26 is a snapshot of
the visualizer for development. The visualizer performed well, and we were able to
conveniently check the behavior of the estimation system throughout the rehearsal
and demonstration.
4.5.4 Discussion
We have described our system that estimates the actions of a dog using acceleration
and angular velocity information measured by an IMU. The results show that our
system can estimate bark, run, stop, and walk actions with good accuracy. They also
show that estimating the sniff_object and sniff_air actions is difficult.
One factor that makes the task difficult is the limited availability of training data:
the sample size is small, and the positive and negative instances are highly unbalanced.
GNSS data were used to display the position of a dog and its trajectory. However,
SAR dogs often perform searches in non-GNSS environments, such as inside
collapsed buildings and in deep forests, where GNSS signals are weak and the
combination of visible GNSS satellites often changes. In such situations, the GNSS receiver
is not able to provide accurate position information. There are several solutions to
improve the accuracy, which include the use of visual information, IMU based iner-
tial navigation, and post-processing GNSS data. The authors have investigated these
solutions.
Fig. 4.28 Cyber-enhanced rescue canine suit for visual SLAM: (Left) A camera suit, (Right)
Original camera image, point cloud data, and its trajectory
The right side of Fig. 4.28 shows the result of visual SLAM when a SAR dog wearing the camera
suit walked around a long container. We confirmed that the three-dimensional point
cloud and the camera’s position were reconstructed from the camera images taken
by the high-speed camera. Even in a non-GNSS environment, it is possible to obtain
the trajectory of a SAR dog. Details of visual SLAM are explained in Chap. 2.
We developed a method for estimating dog velocity and position using an IMU [38].
In an inertial navigation system, because the velocity is calculated by the integration
of the acceleration obtained from the IMU and the position is calculated by the
integration of the velocity, it is necessary to cancel the cumulative error of the velocity
and position. To cancel the cumulative error of the velocity, we use the zero velocity
point (ZVP), a well-known technique used in the analysis of human walking
motion. We analyzed the walking motion of dogs and found that the ZVP can also be
used for dogs. Our method cancels the cumulative error of the velocity and estimates
the velocity and position. Figures 4.29 and 4.30 show the trajectory estimation results
when going around on a plane and walking up and down stairs. In these cases, the
trajectory was estimated by canceling the cumulative error of the velocity estimation.
In the future, it will be used together with GNSS to estimate a more accurate trajectory.
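A minimal sketch of this ZVP-based integration is shown below, assuming a one-axis acceleration signal with gravity already removed and zero-velocity samples detected externally from the gait; the actual method [38] is more elaborate.

```python
import numpy as np

def integrate_with_zvp(acc, dt, zvp_mask):
    """Integrate acceleration into velocity and position, resetting the
    velocity to zero at zero-velocity points (ZVP) to cancel drift.

    acc:      (N,) acceleration along one axis, gravity removed [m/s^2]
    dt:       sample period [s]
    zvp_mask: (N,) bool, True at samples where the gait analysis says the
              dog is momentarily at zero velocity
    """
    v = np.zeros(len(acc))
    for i in range(1, len(acc)):
        v[i] = 0.0 if zvp_mask[i] else v[i - 1] + acc[i] * dt
    x = np.cumsum(v) * dt
    return v, x

# A stationary dog measured with a small sensor bias: without ZVP the
# integrated velocity drifts; with one ZVP per stride it stays bounded.
dt, n = 0.01, 400
acc = np.full(n, 0.05)                       # bias-only acceleration
zvp = np.zeros(n, dtype=bool)
zvp[::50] = True                             # ZVP detected each stride
v_zvp, _ = integrate_with_zvp(acc, dt, zvp)
v_raw, _ = integrate_with_zvp(acc, dt, np.zeros(n, dtype=bool))
print(f"drift without ZVP: {v_raw[-1]:.3f} m/s, with ZVP: {v_zvp[-1]:.3f} m/s")
```

The key point is that the velocity error can never accumulate for longer than one stride, which is what keeps the position estimate usable over the whole trajectory.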
Fig. 4.29 Dog trajectory estimation using IMU and dog gaits on flat ground
Fig. 4.30 Dog trajectory estimation using IMU and dog gaits on stairs
The human operator can change the moving direction of the canine with a joypad. Our results
show that the human operator could guide the canine using the on-suit laser beams
to the place where the canine could watch the target (Figs. 4.33 and 4.34). Details of
the remote instruction are described in [36].
We consider that the on-suit laser-beam-based canine motion control is a starting
point for expanding the working ability of the canine. In the near future, canines
wearing laser beam suits will explore damaged buildings and capture photos of
damage instead of humans during search and rescue missions.
Fig. 4.34 Field test of dog motion instruction: dog motion was controlled using laser beams
4.7 Conclusion
This chapter described a cyber-enhanced rescue canine that strengthens the searching
ability of SAR dogs by incorporating robotics technology. We have developed the
following new technologies for enhancing the search ability of SAR dogs:
I. Lightweight CRC suit that SAR dogs can wear for several hours
II. Retroactive searching for objects recorded in SAR dog camera images
III. Estimation of emotional state of a SAR dog from its biological signal
IV. Estimation of SAR dog behavior from sensors mounted on the CRC suit
V. Trajectory estimation in non-GNSS environments
VI. Remote instruction of SAR dog behaviors
We implemented these technologies into the CRC suits. Table 4.9 shows the
progress of implementation of these technologies. At first, we implemented only
three technologies on the CRC suit No.4: lightweight CRC suit, retroactive searching
for objects, and trajectory estimation in non-GNSS environment. On the basis of the
prototype CRC suit No. 4, we continued the development of each technology along-
side system integration. Then, we implemented five technologies on the CRC suit No.
6: lightweight CRC suit, retroactive searching for objects, estimation of emotional
state, estimation of behaviors, and trajectory estimation in non-GNSS environment.
For visual SLAM and remote instruction, special CRC suits were developed and
were used to evaluate the technologies.
We expect that the technologies of the CRC suits will support rescue workers
efficiently in Japan and around the world. Therefore, we evaluated these technologies
with real end-users such as JRDA, Japanese firefighters, and Italian mountain rescue
dogs. Based on these field tests, we improved the system and functionalities of the
CRC suit and developed a practical CRC suit and its GUI with Japanese companies.
The CRC suit has become more reliable and is ready to be used in disaster sites.
Therefore, we started lending the practical CRC suits to JRDA in July 2018. In
the near future, JRDA will use these suits in real rescue missions in Japan.
190 K. Ohno et al.
Table 4.9 Implementation of technologies to CRC suits and progress of the field tests

                               No. 4    No. 5    No. 6    No. 7    Visual SLAM      Remote instruction
Technology
1. Lightweight CRC suit          ✓        ✓        ✓        ✓          ✓                  ✓
2. Retroactive object search     ✓        ✓        ✓        ✓          -                  -
3. Emotion estimation            -        -        ✓        -          -                  -
4. Behavior estimation           -        -        ✓        ✓          -                  -
5. Trajectory in non-GNSS        ✓        ✓        ✓        ✓          ✓                  -
   (sensor)                     IMU      IMU      IMU      IMU       Camera
   (processing)               Offline  Offline  Offline  Offline  Online, Offline
6. Remote instruction            -        -        -        -          -                  ✓
Subject
House dogs                       ✓        ✓        ✓        ✓          ✓                  ✓
JRDA SAR dogs                    ✓        ✓        ✓        ✓          ✓                  -
Mountain rescue dogs             -        -        ✓        -          -                  -
Fields for evaluation
ImPACT test fields               ✓        ✓        ✓        ✓          ✓                  -
JRDA training fields             ✓        ✓        ✓        ✓          -                  ✓
Acknowledgements This work was supported by the Impulsing Paradigm Change through Disruptive
Technologies (ImPACT) Tough Robotics Challenge program of the Japan Science and Technology
Agency (JST).
References
1. Akselrod, S., Gordon, D., Ubel, F.A., Shannon, D.C., Berger, A.C., Cohen, R.J.: Power spectrum
analysis of heart rate fluctuation: a quantitative probe of beat-to-beat cardiovascular control.
Science 213, 220–222 (1981)
2. Appelhans, B.M., Luecken, L.J.: Heart rate variability as an index of regulated emotional
responding. Rev. Gen. Psychol. 10, 229 (2006)
3. Boissy, A., Manteuffel, G., Jensen, M.B., Moe, R.O., Spruijt, B., Keeling, L.J., Winckler, C.,
Forkman, B., Dimitrov, I., Langbein, J.: Assessment of positive emotions in animals to improve
their welfare. Physiol. Behav. 92, 375–397 (2007)
4. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001). https://doi.org/10.1023/A:
1010933404324
5. Browne, C., Stafford, K., Fordham, R.: The use of scent-detection dogs. Ir. Vet. J. 59, 97 (2006)
6. Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: The 22nd ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining, pp. 785–794 (2016).
https://doi.org/10.1145/2939672.2939785
7. Chiu, W., Arnold, J., Shih, Y., Hsiung, K., Chi, H., Chiu, C., Tsai, W., Huang, W.C.: A survey
of international urban search-and-rescue teams following the Ji Ji earthquake. Disasters 26,
85–94 (2002)
8. den Uijl, I., Álvarez, C.B., Bartram, D., Dror, Y., Holland, R., Cook, A.: External valida-
tion of a collar-mounted triaxial accelerometer for second-by-second monitoring of eight
behavioural states in dogs. PLoS ONE 12(11), e0188481 (2017). https://doi.org/10.1371/
journal.pone.0188481
9. Dreiseitl, S., Ohno-Machado, L.: Logistic regression and artificial neural network classification
models: a methodology review. J. Biomed. Inform. 35(5–6), 352–359 (2002)
10. Ekman, P., Levenson, R.W., Friesen, W.V.: Autonomic nervous system activity distinguishes
among emotions. Science 221, 1208–1210 (1983)
11. ELAN (version 5.2). https://tla.mpi.nl/tools/tla-tools/elan/. Max Planck Institute for Psycholin-
guistics, The Language Archive, Nijmegen, The Netherlands
12. Ferworn, A., Sadeghian, A., Barnum, K., Ostrom, D., Rahnama, H., Woungang, I.: Canine as
robot in directed search. In: Proceedings of IEEE/SMC International Conference on System
of Systems Engineering, Los Angeles, CA, USA (2006)
13. Ferworn, A., Sadeghian, A., Barnum, K., Rahnama, H., Pham, H., Erickson, C., Ostrom, D.,
Dell'Agnese, L.: Urban search and rescue with canine augmentation technology. In: Proceedings
of IEEE/SMC International Conference on System of Systems Engineering, Los Angeles, CA,
USA (2006)
14. Ferworn, A., Waismark, B., Scanlan, M.: CAT 360 – Canine augmented technology 360-degree
video system. In: 2015 IEEE International Symposium on Safety, Security, and Rescue Robotics
(SSRR) (2015)
15. Finn, C., Abbeel, P., Levine, S.: Model-agnostic meta-learning for fast adaptation of deep
networks. In: Proceedings of the 34th International Conference on Machine Learning, ICML,
pp. 1126–1135 (2017)
16. Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 25,
1189–1232 (2001)
17. Fukushima, K.: Neocognitron: a self-organizing neural network model for a mechanism of
pattern recognition unaffected by shift in position. Biol. Cybern. 36(4), 193–202 (1980)
18. Gerencsér, L., Vásárhelyi, G., Nagy, M., Vicsek, T., Miklósi, A.: Identification of behaviour in
freely moving dogs (Canis familiaris) using inertial sensors. PLoS ONE 8(10), e77814 (2013).
https://doi.org/10.1371/journal.pone.0077814
19. Hamada, R., Ohno, K., Matsubara, S., Hoshi, T., Nagasawa, M., Kikusui, T., Kubo, T., Naka-
hara, E., Ikeda, K., Yamaguchi, S.: Real-time emotional state estimation system for Canines
based on heart rate variability. In: CBS, pp. 298–303 (2017)
20. Hearst, M.A., Dumais, S.T., Osuna, E., Platt, J., Scholkopf, B.: Support vector machines. IEEE
Intell. Syst. Appl. 13(4), 18–28 (1998)
21. Inagaki, H., Kuwahara, M., Tsubone, H.: Changes in autonomic control of heart associated
with classical appetitive conditioning in rats. Exp. Anim. 54, 61–69 (2005)
22. Masci, J., Meier, U., Cireşan, D., Schmidhuber, J.: Stacked convolutional auto-encoders for
hierarchical feature extraction. In: Artificial Neural Networks and Machine Learning – ICANN
2011. Lecture Notes in Computer Science (2011)
23. K9-CameraSystem. http://www.tacticalelectronics.com/products/21/products/55/k-9-
systems/70/k-9-back-mounted-camera (2011). Accessed 31 May 2014
24. Katayama, M., Kubo, T., Mogi, K., Ikeda, K., Nagasawa, M., Kikusui, T.: Heart rate variability
predicts the emotional state in dogs. Behav. Proc. 128, 108–112 (2016)
25. Komori, Y., Fujieda, T., Ohno, K., Suzuki, T., Tadokoro, S.: Search and rescue dogs'
barking detection from audio and inertial sensors. In: The Proceedings of JSME Annual Con-
ference on Robotics and Mechatronics (ROBOMECH), pp. 1A1-U10_1–1A1-U10_4. The
Japan Society of Mechanical Engineers (2015). https://doi.org/10.1299/jsmermd.2015._1A1-
U10_1
26. Kreibig, S.D.: Autonomic nervous system activity in emotion: a review. Biol. Psychol. 84,
394–421 (2010)
27. Kruijff, G.J.M., Kruijff-Korbayová, I., Keshavdas, S., Larochelle, B., Janíček, M., Colas, F.,
Liu, M., Pomerleau, F., Siegwart, R., Neerincx, M.A., Looije, R., Smets, N.J.J.M, Mioch, T.,
van Diggelen, J., Pirri, F., Gianni, M., Ferri, F., Menna, M., Worst, R., Linder, T., Tretyakov, V.,
Surmann, H., Svoboda, T., Reinštein, M., Zimmermann, K., Petříček, T., Hlaváč, V.: Designing,
developing, and deploying systems to support human–robot teams in disaster response. Adv.
Robot. 28(23), 1547–1570 (2014). https://doi.org/10.1080/01691864.2014.
985335
28. Ladha, C., Belshaw, Z., J, O., Asher, L.: A step in the right direction: an open-design pedometer
algorithm for dogs. BMC Vet. Res. 14(1), 107 (2018). https://doi.org/10.1186/s12917-018-
1422-3
29. Lane, R.D., McRae, K., Reiman, E.M., Chen, K., Ahern, G.L., Thayer, J.F.: Neural correlates
of heart rate variability during emotion. Neuroimage 44, 213–222 (2009)
30. LeCun, Y., Boser, B., Denker, J.S., Howard, R.E., Habbard, W., Jackel, L.D., Henderson, D.:
Handwritten digit recognition with a back-propagation network. Adv. Neural Inf. Process. Syst.
2, 396–404 (1990)
31. LeDoux, J.: Rethinking the emotional brain. Neuron 73, 653–676 (2012)
32. Michael, N., Shen, S., Mohta, K., Mulgaonkar, Y., Kumar, V., Nagatani, K., Okada, Y., Kirib-
ayashi, S., Otake, K., Yoshida, K., Ohno, K., Takeuchi, E., Tadokoro, S.: Collaborative mapping
of an earthquake-damaged building via ground and aerial robots. J. Field Robot 29(4), 832–841
(2012)
33. Murphy, R.: Disaster Robotics. MIT Press, Cambridge (2014)
34. Nagatani, K., Kiribayashi, S., Okada, Y., Otake, K., Yoshida, K., Tadokoro, S., Nishimura, T.,
Yoshida, T., Koyanagi, E., Fukushima, M., Kawatsuma, S.: Emergency response to the nuclear
accident at the Fukushima Daiichi nuclear power plants using mobile rescue robots. J. Field
Robot. 30(1), 44–63 (2013)
35. Narisada, S., Mashiko, S., Shimizu, S., Ohori, Y., Sugawara, K., Sakuma, S., Sato, I., Ueki, Y.,
Hamada, R., Yamaguchi, S., Hoshi, T., Ohno, K., Yoshinaka, R., Shinohara, A., Tokuyama, T.:
Behavior identification of search and rescue dogs based on inertial sensors. In: The Proceedings
of JSME annual Conference on Robotics and Mechatronics (ROBOMECH). The Japan Society
of Mechanical Engineers (2017). https://doi.org/10.1299/jsmermd.2017.2A1-Q04
36. Ohno, K., Yamaguchi, S., Nishinoma, H., Hoshi, T., Hamada, R., Matsubara, S., Nagasawa,
M., Kikusui, T., Tadokoro, S.: Control of canine's moving direction by using on-suit laser
beams. In: IEEE CBS (2018)
37. Reefmann, N., Wechsler, B., Gygax, L.: Behavioural and physiological assessment of positive
and negative emotion in sheep. Anim. Behav. 78, 651–659 (2009)
38. Sakaguchi, N., Ohno, K., Takeuchi, E., Tadokoro, S.: Precise velocity estimation for dog using
its gait. In: Proceedings of The 9th Conference on Field and Service Robotics (2013)
39. Slensky, K.A., Drobatz, K.J., Downend, A.B., Otto, C.M.: Deployment morbidity among
search-and-rescue dogs used after the September 11, 2001, terrorist attacks. J. Am. Vet. Med.
Assoc. 225, 868–873 (2004)
40. Tran, J., Ferworn, A., Ribeiro, C., Denko, M.: Enhancing canine disaster search. In: Proceedings
of IEEE/SMC International Conference on System of Systems Engineering Monterey, CA,
USA (2008)
41. Tsoumakas, G., Katakis, I., Vlahavas, I.P.: Mining multi-label data. In: Data Mining and Knowl-
edge Discovery Handbook, 2nd ed., pp. 667–685 (2010). https://doi.org/10.1007/978-0-387-
09823-4_34
42. Wagner, J., Kim, J., André, E.: From physiological signals to emotions: implementing and
comparing selected methods for feature extraction and classification. In: IEEE/ICME, pp.
940–943 (2005)
43. Wittenburg, P., Brugman, H., Russel, A., Klassmann, A., Sloetjes, H.: ELAN: a professional
framework for multimodality research. In: Proceedings of 5th International Conference on
Language Resources and Evaluation (LREC 2006), pp. 1556–1559 (2006)
44. Yamaguchi, S., Ohno, K., Okada, Y., Suzuki, T., Tadokoro, S.: Sharing of search and rescue
dog’s investigation activities by using cloud services and mobile communication service. In:
The Proceedings of JSME annual Conference on Robotics and Mechatronics (ROBOMECH),
p. 1A1-09a2. The Japan Society of Mechanical Engineers (2016). https://doi.org/10.1299/
jsmermd.2016.1A1-09a2
45. Yamakawa, T., Fujiwara, K., Miyajima, M., Abe, E., Kano, M., Ueda, Y.: Real-time heart rate
variability monitoring employing a wearable telemeter and a smartphone. In: APSIPA-ASC,
pp. 1–4 (2014)
46. Yonezawa, K., Miyaki, T., Rekimoto, J.: Cat@Log: sensing device attachable to pet cats for
supporting human-pet interaction. In: Proceedings of International Conference on Advances
in Computer Entertainment Technology, pp. 149–156 (2009)
Chapter 5
Dual-Arm Construction Robot with
Remote-Control Function
low work efficiency because of difficult remote control. As part of the ImPACT-
TRC Program, a group of Japanese researchers attempts to solve these problems by
developing a construction robot for disaster relief tasks with a new mechanism and
new control methods. This chapter presents an overview of the construction robot
and the details of the main elemental technologies that make up the robot. Section 5.1
describes the basic configuration of the robot and the teleoperation system. Section 5.2
describes a tether-powered drone that provides extra visual information. Sections 5.3
and 5.4 describe force and tactile feedback for skillful teleoperation. Section 5.5
describes visual information feedback, which consists of an arbitrary-viewpoint
visualization system and a visible and LWIR camera system for observing the robot's
surroundings in dark night scenes and/or very foggy scenes. These functions can
dramatically increase construction equipment's capacity to deal with large-scale
disasters and accidents.
A. Yamashita
e-mail: yamashita@robot.t.u-tokyo.ac.jp
M. Tanaka
Advanced Industrial Science and Technology (AIST)/Tokyo Institute of Technology,
Tokyo, Japan
e-mail: mtanaka@sc.e.titech.ac.jp
H. Asama
University of Tokyo, Tokyo, Japan
e-mail: asama@robot.t.u-tokyo.ac.jp
T. Shibata
NEC Corporation, Tokyo, Japan
e-mail: t-shibata@hw.jp.nec.com
M. Okutomi · K. Suzumori · T. Ide · A. Yamamoto
Tokyo Institute of Technology, Tokyo, Japan
e-mail: mxo@sc.e.titech.ac.jp
K. Suzumori
e-mail: suzumori@mes.titech.ac.jp
T. Ide
e-mail: ide.t.ad@m.titech.ac.jp
A. Yamamoto
e-mail: yamamoto.a.ag@m.titech.ac.jp
Y. Sasaki · F. Kanehiro
Advanced Industrial Science and Technology (AIST), Tokyo, Japan
e-mail: y-sasaki@aist.go.jp
F. Kanehiro
e-mail: f-kanehiro@aist.go.jp
5 Dual-Arm Construction Robot with Remote-Control Function 197
5.1.1 Introduction
Construction machinery is often used for disaster response work in case of earth-
quakes, landslides, etc. Among these machines, the hydraulic excavator (Fig. 5.1)
has been playing a central role in disaster sites because of the traveling performance
using its crawlers and its multifunctional workability enabled by its multi-joint arms.
The hydraulic excavator is a construction machine for excavating and loading earth
and sand, but attaching various end effectors to it allows it to do cutting and handling
processes as well. Moreover, the collaborative use of its traveling mechanism and
work machine arm allows it to go beyond large steps and grooves and escape from
muddy ground, such as liquefied soil. These functions of hydraulic excavators are
demonstrated effectively in the relief work in disaster areas.
A hydraulic excavator is a machine for excavating the ground with immense force.
Delicately controlling force is not its strong suit. Moreover, its body is difficult to
stabilize on scaffolds with uneven surfaces, such as on top of a rubble, because the
lower part of its traveling mechanism is a fixed-type crawler without a suspension.
Y. Yokokohji
Kobe University, Kobe, Japan
e-mail: yokokohji@mech.kobe-u.ac.jp
G. Ishigami
Keio University, Tokyo, Japan
e-mail: ishigami@mech.keio.ac.jp
S. Ozaki
Yokohama National University, Yokohama, Japan
e-mail: s-ozaki@ynu.ac.jp
K. Hioki
JPN CO. LTD., Ota, Japan
e-mail: hioki@j-p-n.co.jp
T. Oomichi · S. Ashizawa
Meijo University, Nagoya, Japan
e-mail: oomichi@meijo-u.ac.jp
S. Ashizawa
e-mail: ashizawa@meijo-u.ac.jp
T. Takamori
International Rescue System Institute (IRS), Kobe, Japan
e-mail: takamori@rescuesystem.org
T. Kimura
Nagaoka University of Technology, Nagaoka, Japan
e-mail: kimura@mech.nagaokaut.ac.jp
R. R. Murphy
Texas A&M University, Texas, USA
e-mail: robin.r.murphy@tamu.edu
198 H. Yoshinada et al.
Therefore, the machine might behave unexpectedly while in use. For this reason,
hydraulic excavators are not often used during the early stages of disasters when
there are possible victims under collapsed buildings or earth and sand.
The hydraulic excavator is a human-operated machine; hence, strong non-linear
characteristics are imparted to its work machine's driving system to match the oper-
ator's maneuvering senses. Furthermore, restrictions on the equipment used cause
large hysteresis and a lag time of approximately 0.1–0.2 s. Therefore, applying
various control laws, particularly servo control, is not easy, making automation
difficult to achieve.
Remote operation would be especially favorable during a disaster response
because situations in which the machine operator is also at risk can easily be fore-
seen. A remote-controlled device is available as an option on the hydraulic exca-
vator; however, most such devices must be operated within a direct-view distance of
about 100 m, which is insufficient for disaster response activities. For a long-distance,
remote-controlled operation using image transmission, an unmanned construction
system is used for soil erosion control work in areas like Unzen Fugendake [5]. This
unmanned construction system is limited to relatively routine work, and a number of
vehicles with cameras must be arranged around the hydraulic excavator to improve
workability. The situations where it can be used are still limited.
The goal of the Construction Robots of the ImPACT Tough Robotics Challenge
is to implement a disaster response construction robot with dramatically improved
motion control characteristics and improved remote controllability by reviewing the
current hydraulic excavator mechanism, hydraulics, control, and operation system
from their very origins.
The Construction Robots of the ImPACT Tough Robotics Challenge aim to imple-
ment the following:
Goal 1: A machine with a strong, delicate, and dexterous workability
Goal 2: A machine with high ground adaptability
Goal 3: A machine with flexible remote controllability
To achieve the above-mentioned goals, this research and development project is
promoting the development of the following new mechanisms and systems:
Humans do most of their work using both arms, as with eating using a knife and a
fork. Giving the robot two arms enables it to have a dramatically greater freedom
to work than the robot with a single arm. The two arms are generally configured
to the image of human arms, with the right and left arm on either side of the body.
Almost all multi-arm robots that have been developed so far have this composition.
However, in a mechanical system, the basis for configuration may not necessarily
need to be humans. Other possibilities can be explored. The robots being referred to
in this research adopt the “double-swing dual-arm mechanism.” This configuration
has the two arms growing from the waist. Just like when humans use the force from
their waist when carrying heavy loads, the robot’s ability to respond to heavy load is
significantly increased by having arms at its waist. The concept of a right arm/left arm
is also abandoned in favor of adopting the “upper arm” and “lower arm” composition.
In this configuration, the two arms are placed one on top of the other, and both can
rotate endlessly through 360°. In this way, the layout of both arms can be set
freely. Moreover, there is no distinction between the front and back of the robot. This
is a mechanism unique to robots and impossible for living things.
In the double-swing dual-arm mechanism (Fig. 5.2), the pivot parts of the shoulder
of the left and right arms are coaxially overlapped. Compared with the mechanism,
where the shoulder joints are fixed on separate axes, the bearings of a much larger
diameter can be used herein. In addition, both arms are supported near the robot’s
center of gravity; hence, it has a high stability feature. Through this mechanism, this
robot is highly adaptable to carrying large loads, and its structure makes it suitable for
heavy work. Each arm is coaxially arranged and rotates through 360°; thus, its orientation
can be freely changed. The robot can move with a crawler while supporting the ground
with an arm that can turn freely. It can adapt highly to harsh environments at disaster
sites. For example, on a steep slope or a site with intense surface irregularities, the
robot can stabilize itself by grasping standing trees or fixed objects on the ground
with one of its arms, while the other arm performs handling work. In addition, the
robot can go beyond a large step (Fig. 5.3) by operating the arm and the crawler
Fig. 5.2 Double-swing mechanism. The pivot parts of the shoulder of the left and right arms are
coaxially overlapped
Fig. 5.3 Various usage of dual-arm. On a steep slope or a site with intense surface irregularities,
the robot can stabilize itself by grasping standing fixed objects on the ground with one of its arms
together. Both arms can be turned in the same direction to do dual-arm cooperative
work. In this case, the orientation of the left and right arms can be changed by taking
advantage of their double turning feature. Thus, when different functions are given
to either arm, such as a cutter on one arm and a gripper on the other, appropriate
functions can be assigned to each arm depending on the work situation, and the robot
can respond accordingly to the diverse and complicated tasks in disaster sites.
However, even if there are two arms, unless they can both be freely controlled,
the arms will more likely only get in each other’s way and end up causing more
disadvantages. To solve this problem, the construction robot herein was developed
with a method that appropriately controls the pressure applied to the cylinders at high
speed together with target-value control of position and speed, which makes it possible
to control a highly stable, highly responsive, large-inertia work machine without
generating large overshoot or oscillation. Together with increasing the responsiveness
of the hydraulic system by one order of magnitude over conventional construction
machines, high motion performance and force controllability are implemented in the
work machine by significantly improving the frictional characteristics of the actuators.
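The idea of combining fast cylinder-pressure control with position/speed target control can be sketched as a control cascade on a toy one-DoF model; all gains, masses, and dynamics below are illustrative assumptions, not the robot's actual controller.

```python
def simulate_arm(x_ref=1.0, dt=0.001, steps=4000,
                 kp_pos=500.0, kd_pos=300.0, k_press=200.0,
                 mass=50.0, area=0.02):
    """Cascade sketch: an outer position loop commands a target cylinder
    pressure (i.e. force), and a fast inner loop tracks that pressure.
    Parameters are toy values chosen only to make the loop stable."""
    x = v = p = 0.0                                 # arm position, velocity, pressure
    for _ in range(steps):
        f_ref = kp_pos * (x_ref - x) - kd_pos * v   # outer loop: target force
        p_ref = f_ref / area                        # convert to cylinder pressure
        p += k_press * (p_ref - p) * dt             # fast inner pressure loop
        a = p * area / mass                         # cylinder force accelerates the arm
        v += a * dt
        x += v * dt
    return x, v

x, v = simulate_arm()
print(f"position after 4 s: {x:.3f} m (target 1.0), velocity {v:.4f} m/s")
```

Because the inner pressure loop is roughly an order of magnitude faster than the outer motion loop, the outer loop sees an almost ideal force actuator, and the arm settles on the target without large overshoot.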
Figure 5.6 shows the outer appearance of a construction robot with a double-swing
dual-arm mechanism. The robot weighs 2.5 t, and has an overall width of 1450 mm
and an overall height of 1900 mm. The crawler has a total length of 1880 mm. It is
powered using a diesel engine with an output of 16 kW. Both arms can manage heavy
loads, but each arm has also been given its own unique feature, allowing for other
types of heavy work. The upper arm can be used for delicate and dexterous work,
such as those of a manipulator, while the lower arm can perform digging work like
a hydraulic excavator.
Fig. 5.6 Construction robot using double-swing dual-arm mechanism. The robot has 21 DoF
The upper arm has a 7 DoF + gripper configuration. A highly controllable low-
friction element developed by this project is being used in the actuator of each joint.
It has a haptic and tactile display function for the operator. The haptic function
estimates the hand load force with high accuracy [1], while the force-feedback-
type bilateral control transmits work reaction force to the operator. The gripper’s
grip force is sent back to the operator by the same bilateral control. The tactile
function uses a vibration pickup attached to the wrist of the robot arm, which detects
tactile information as a vibration signal. The vibration signals picked up consist of
various vibrations. Within these, those related to contact are extracted in real time
and transmitted to the operator [34]. The operator sees the tactile information through
the tactile display attached to the wristband. In addition, the adopted system does not
incorporate sensors (for both haptic and tactile) to the robot’s fingertips, making it
suitable for heavy-duty machines, such as this robot. The lower arm has a 7 DoF
configuration with a tactile display function; no haptic display is currently provided,
although incorporating one would not be difficult. A four-fingered tough robot hand
developed in this project is attached at the tip of the lower arm [17]. This hand, which is
powered by six hydraulic cylinders and two hydraulic motors, can be switched to
“bucket mode,” to excavate earth, sand, etc. and “hand mode” to handle objects with
its four fingers based on the shape of the object by operating the robot’s actuators to
change the hand shape. With the hand mode, the target can be grasped based on its
shape, and the grip strength can be adjusted within the 14–3000 N range.
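The force-feedback-type bilateral control described above can be illustrated on a one-DoF model in which the slave arm servos to the master position and the load force estimated at the slave is reflected back to the operator; the gains, masses, and force values below are assumptions for illustration only.

```python
def simulate_slave(x_master=0.5, f_env=0.0, steps=5000, dt=0.001,
                   kp=400.0, kd=80.0, m_slave=10.0):
    """One-DoF sketch of force-feedback bilateral control: the slave
    tracks the master position, and the environment reaction force is
    sent back to the operator's master device."""
    xs = vs = 0.0
    f_reflect = 0.0
    for _ in range(steps):
        f_cmd = kp * (x_master - xs) - kd * vs   # slave servos to the master
        vs += (f_cmd - f_env) / m_slave * dt     # environment pushes back
        xs += vs * dt
        f_reflect = f_env                        # force reflected to the master
    return xs, f_reflect

# Free motion: the slave reaches the commanded position.
print(simulate_slave(f_env=0.0))
# In contact with a 40 N load: the slave deflects by f_env/kp = 0.1 m,
# and the operator feels the 40 N reaction force.
print(simulate_slave(f_env=40.0))
```

The steady-state deflection under load (f_env/kp) is what the operator perceives as stiffness, which is why the work reaction force, rather than position error alone, is transmitted back.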
Fig. 5.7 Visual sensing devices. The robot is equipped with synthetic bird’s-eye view cameras, a
FIR panorama camera and a tether powered drone camera
images. In addition, a camera is built within the robot’s hands, which is useful in
grasping the positional relationship between the hand and the target object. Recording
the other arm with this camera enables observation of the situation of the arm and
the target object (Fig. 5.9).
The construction robot acquires various image data; hence, they must be dis-
played to the operator in an easy-to-understand manner. Moreover, the image data
should be reproduced in a wide viewing space to improve workability through the
remote control, which makes it seem as if the operator is in the actual site. All
equipment must also be easy to transport and install in disaster sites.
A multi-monitor system is generally used when displaying a number of images.
However, in this system, the size and the layout of the screen are fixed, and visibility
is not high. In addition, multiple monitors need to be used to display a panoramic
image, and the seams in each of the screens obstruct the view.
To seamlessly display panoramic images, immersive screens projected onto
curved surfaces would be more effective. The image displayed from the construction robot is predominantly a view of objects several meters or more away; hence, a screen that is too close appears unnatural and causes eye strain. To eliminate this issue, the viewing distance from the screen needs to be 1–1.5 m or
more, and to obtain a wide field of view, the image size must be increased. Wide viewing screens are generally spherical and are often made from injection-molded plastic or formed metal; such screens are mostly heavy and cannot be folded, which makes them difficult to transport. In addition, ambient sound reflected off a spherical screen reaches the robot operator and causes discomfort.
To solve these problems, this project is developing an image presentation system
based on new ideas [24]. The image display system of the construction robot adopts
5 Dual-Arm Construction Robot with Remote-Control Function 207
the method of elastically deforming a flat plate of flexible plastic into a cylindrical shape and fixing it to a frame to form a screen. Figure 5.10 shows the process of creating this screen. Both ends of the flat plate are elastically deformed into a half-cylindrical shape and affixed to the frame with metal fittings and bolt fasteners; affixing the plate to the frame holds it against the cylindrical surface.
The screen was cut from a white polycarbonate plate, which is 3 mm thick, and
subjected to a very fine blast treatment on the surface, forming a projection screen.
As shown in Fig. 5.11, the screen is inclined inwards in the operator’s perspective.
Because of this inclination, the viewing distance increases gradually as the line of sight shifts from bottom to top. This matches the human physiological reflex [13] and is considered unlikely to cause eye strain. Because the screen is a tilted cylindrical surface, sound reflected off the screen escapes upwards, which is expected to minimize the operator’s discomfort.
The images are projected by four projectors from above, so that the robot operator does not cast a shadow on the screen. The projection path is folded back by a surface mirror to reduce the height of the device (Fig. 5.12). The screen body is lightweight and pliable and can therefore be rolled up, and the frame can be disassembled into parts. Furthermore, the surface mirror is a lightweight film mirror. These features make the system easy to transport, assemble, and install at a disaster site.
As the operation interface (Fig. 5.13), two 7 DoF haptic devices are provided, one for the upper arm and one for the lower arm, and four foot pedals are installed for travel and swing operations. The robot’s work machine is operated in a dissimilar master–slave configuration incorporating force-feedback bilateral control. A function that turns the master lever into a three-dimensional mouse is added, making it possible to utilize the entire operation area of the work machine. The motion scale ratio from the master to the slave can also be changed depending on the work content. Switch boxes are provided on the left and right of the operator; a lever to operate the blade is installed in the right switch box, and the other switches are used for activating/stopping the robot and for several mode changes. Except for blade operation, the operator can freely manipulate the robot in all degrees of freedom without releasing the master lever. Dozing work with the blade is normally performed using only travel and blade operations, which causes no issues because the arms are not moved at that time.
Figure 5.14 shows the robot being operated with the developed system interface. Multiple images can be displayed at arbitrary sizes and angles, which improves visibility for the operator and makes the work more efficient. Note that switching between images and changing the camera perspectives are handled by a sub-operator.
Fig. 5.13 Operation interface. Two 7 DoF haptic devices operate the upper and lower arms, and four foot pedals are used for travel and swing operations. The robot’s work machine is operated in a dissimilar master–slave configuration with force-feedback bilateral control
Fig. 5.14 Operation with the system interface. Displaying multiple images with arbitrary sizes and
angles is possible
The robot and the remote control cockpit communicate with each other wirelessly using a combination of dedicated and general-purpose wireless devices in the 5.7 GHz, 5.0 GHz, and 2.4 GHz bands.
Evaluation experiments have been conducted several times in a test field that simulates a disaster site, and improvements have been made to the robot between experiments. Some of these evaluation experiments are described below.
Figure 5.15 shows road clearing work performed using both arms. The scenario was as follows: rubble lies on the road along which the rescue team is heading, and it must be removed to clear the road. The two arms were first used together: the upper arm gripped the upper end of an iron pipe in the rubble, while the lower arm cut the lower end of the pipe with the cutter so that it could be removed. Iron pipes lay on both the left and the right, so the cutter had to be shifted from the left to the right to cut the lower ends of both pipes. This was achieved by changing the orientation of the arms using the double-swing dual-arm mechanism.
The scenario in Fig. 5.16 was: search for survivors in a collapsed house by working
to secure an accessible path for the rescue team. The collapsed roof is to be peeled off,
and the interior of the collapsed house is to be searched using the built-in camera in
the robot’s hand. Survivors must be found, and an entry path must be secured for the
rescue team. First, the earth and sand on the roof were removed without destroying
the roof. The roof was then lifted off with one hand, and the jack was inserted under
the roof with the other hand. This series of operations was performed entirely by remote control.
In addition, a typical peg-in-hole task was done to evaluate the robot’s ability to
perform delicate and dexterous work (Fig. 5.17). The pin diameter was 60 mm, and
the hole clearance was 200 µm. Neither the pin edge nor the hole edge was chamfered. In an
experiment conducted separately, the robot was able to perform the fitting even with
a hole clearance of 50 µm.
5.1.5 Conclusion
One of the goals of the ImPACT Tough Robotics Challenge Construction Robot
development project, which is “a machine that is strong, and can do delicate and
dexterous work,” has been achieved to some degree. This robot is going through
the process of system integration, and its development is making progress. In the
future, through field evaluation experiments, the plan is to continue research and
development toward achieving the final goal.
5 Dual-Arm Construction Robot with Remote-Control Function 211
Fig. 5.17 Peg-in-hole task. The pin diameter was 60 mm, and the hole clearance was 200 µm
5.2.1 Introduction
Fig. 5.18 Tether powered MUAV and helipad installed on a dual-arm construction robot
3. The MUAV flies only within its tether length. Thus, even if the MUAV goes out of control, it is safe because it cannot stray outside the flight area.
4. By rewinding the tether, reliable pinpoint landings of the MUAV can be achieved.
Studies of tether powered multirotor MUAVs have been performed [6, 40], and
some multirotor MUAVs have become commercially available [8]. The tether pow-
ered multirotor MUAV system on the construction robot developed in this project,
shown in Fig. 5.18, has three major differences from conventional tether powered
multirotor MUAVs.
The first difference is the position estimation function for the MUAV. In case of a disaster, there is no guarantee that the operator can maintain direct visual contact with the MUAV. Therefore, the MUAV needs autonomous flight, takeoff, and landing capabilities, and a position estimation function is required to ensure this autonomy.
Generally, a global navigation satellite system (GNSS) is used for a typical MUAV.
However, the position accuracy gets worse when cliffs, large trees, or bridges are
nearby. To solve this problem, the developed system has a novel position estimation method that uses tether state estimation.
The second difference is horizontal flight. The typical objective of tether powered
multirotor MUAV in general use is vertical fixed-point observation, and the MUAV
does not have to move dynamically. However, for remote control of a construction robot, the viewpoint of the MUAV has to be moved horizontally depending on the task. During low-altitude flight of the MUAV, the power feeding tether may hang down and be caught on an object in the environment. To avoid such situations, the developed system provides fine and appropriate tension control of the tether.
The third difference is robustness against vibration and inclination of the construction robot. As the power of a general construction machine is supplied by an engine, the machine vibrates. Furthermore, the target environment is natural uneven terrain, so the robot may be inclined, and a conventional tether-tension-control system may not work well under such conditions. Therefore, the helipad mounted on the construction robot has a tether-tension-control winch that uses a powder clutch to achieve robustness against vibration and inclination.
In the following subsections, a position estimation method for an MUAV, the
development of the tether powered MUAV and helipad system, and outdoor experi-
ments are introduced.
The proposed position estimation method is based on the observation of a slack tether with a low tether tension. When both endpoints of a string-like object are fixed at any two points, the object forms a catenary curve. In the proposed system, it is assumed that the tether between the MUAV and the helipad forms a catenary curve. Thus, here, the goal of the position estimation of the MUAV
is to calculate the locations of the MUAV and the helipad on the catenary curve,
as shown in Fig. 5.19. A brief description of the method is given in the following
section, and details are described in [22].
The catenary curve, whose origin is at the vertex in the x–z plane, is expressed by a hyperbolic function as follows:

    z = a cosh(x/a) − a = a (e^(x/a) + e^(−x/a))/2 − a,   (5.1)
where a denotes the catenary number, and it is known that a = k/(W g). The tether tension at the vertex is denoted by k, the gravitational acceleration is g, and the line density is W. To obtain x, Eq. 5.2 can be derived by differentiating Eq. 5.1 and solving for x. When a is known and the curve slope dz/dx at an arbitrary point on the curve is obtained, the position x can be calculated.
    x = a ln( dz/dx + √((dz/dx)² + 1) ).   (5.2)
The tension vector T at any arbitrary point on the curve corresponds to the slope angle θ of the curve at that point. The horizontal component of T corresponds to the tension k at the vertex of the catenary curve, and the vertical component corresponds to the weight of the tether between the vertex and the point. Therefore, the following equations using θ are derived:

    T cos θ = k,   (5.3)
    T sin θ = W g s,   (5.4)

where s denotes the arc length from the origin to the point.
According to the above equations, a point location (x, z) on the curve can be
calculated by knowing a and θ. This means the point location can be obtained from the line density W and the tension vector T.
Furthermore, when the curve length between point A and point B on the curve
is obtained, the vertical component of T is calculated at point B from Eq. 5.4. The
horizontal component of T is constant on the curve. Therefore, T is fixed, and the
location of point B is also calculated based on the above Eqs. 5.2–5.4.
Measurement of the tether tension T can be performed at the helipad. Thus, the
location of the helipad on the catenary curve is calculated by measuring T . The
MUAV location on the catenary curve is then calculated by the measured tether
length S. Note that T is a three-dimensional vector, and the position of the MUAV
is calculated three-dimensionally.
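The estimation procedure described by Eqs. 5.1–5.4 can be sketched in a few lines. The following is a minimal planar sketch, not the authors’ implementation; the function name, the frame conventions, and the assumption that the MUAV lies farther from the catenary vertex than the helipad are all illustrative.

```python
import math

def catenary_position(T, theta, S, W, g=9.81):
    """Planar (x, z) positions of the helipad and the MUAV on the catenary
    formed by a slack tether, from the tension magnitude T [N] and slope
    angle theta [rad] measured at the helipad, the paid-out tether length
    S [m], and the tether line density W [kg/m] (Eqs. 5.1-5.4)."""
    k = T * math.cos(theta)                    # horizontal tension, constant along the curve (Eq. 5.3)
    a = k / (W * g)                            # catenary number a = k / (W g)
    s_helipad = T * math.sin(theta) / (W * g)  # arc length vertex -> helipad (Eq. 5.4)
    s_muav = s_helipad + S                     # arc length vertex -> MUAV (assumed farther out)

    def point(s):
        # On a catenary, s = a*sinh(x/a), so the slope dz/dx at arc length s is s/a.
        slope = s / a
        x = a * math.log(slope + math.sqrt(slope ** 2 + 1.0))  # Eq. 5.2
        z = a * math.cosh(x / a) - a                           # Eq. 5.1
        return x, z

    return point(s_helipad), point(s_muav)
```

The MUAV position relative to the helipad is the difference of the two returned points; rotating this planar result about the vertical axis by the measured yaw angle of the tether outlet gives the three-dimensional estimate described above.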
To enable appropriate control of the tether tension, and to obtain the tether tension
T and tether length S, a tether-tension-control winch was developed and located on
the helipad.
Conventional tether tension control uses feedback control based on a measurement of the tether tension. Such a measurement is typically made by measuring the displacement of a movable pulley to which a spring is connected. However, when an acceleration is applied to the measurement device, it measures the sum of the tether tension and the acceleration effect. Furthermore, when the helipad is inclined, the gravitational acceleration component changes, and additional measurement errors occur. Therefore, the typical feedback control method was difficult to apply under the target conditions of the project.
Fig. 5.21 CAD image of the tether-tension-control winch, source from [21]
To solve the problem mentioned above, a powder clutch that can specify arbitrary
torque with open loop control was chosen, instead of a tether-tension-measurement.
Figure 5.21 shows a CAD model of the winch that includes a powder clutch. The
clutch utilizes magnetic powder, and it transmits torque from the motor to the spool
according to the applied current. Once torque control of the spool is realized, the tension of the tether can be calculated from the spool torque and the spool radius. To estimate the tether tension accurately, it is very important to estimate the spool radius, which changes according to the extended tether length. Therefore, in this project, a mechanism that winds up the tether densely was developed to estimate the spool radius precisely. To realize this mechanism, an asynchronous guide roller was installed. The guide roller, driven by a smart motor, moves in synchronism with the rotation of the spool so that the spool winds the tether densely. With this mechanism, the helipad can accurately generate an arbitrary tension in the tether at any time, even under vibration and inclination of the robot. The spool winding mechanism also contributes to accurate measurement of the tether length.
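The open-loop tension estimate can be illustrated with a simple winding model. This is a hedged sketch, not the actual winch firmware: the core radius, tether diameter, turns per layer, and both function names are assumed values for illustration. The dense-winding mechanism is what makes a layered model like this valid.

```python
import math

def spool_radius(wound_length, core_radius=0.05, tether_dia=0.003, turns_per_layer=40):
    """Effective spool radius [m] for a densely wound tether.

    Assumed geometry: the tether fills complete layers of a fixed turn
    count, so the working radius grows by one tether diameter per layer."""
    length = 0.0
    layer = 0
    while True:
        r = core_radius + (layer + 0.5) * tether_dia      # mid-line radius of this layer
        layer_len = 2.0 * math.pi * r * turns_per_layer   # tether length held by this layer
        if length + layer_len >= wound_length:
            return r
        length += layer_len
        layer += 1

def tether_tension(clutch_torque, wound_length):
    """Open-loop tension estimate: the torque set on the powder clutch
    divided by the current effective spool radius (tension = torque / radius)."""
    return clutch_torque / spool_radius(wound_length)
```

Because the powder clutch transmits a torque proportional to its current regardless of slip, an estimate of this form needs no tension feedback and is unaffected by vibration or inclination of the helipad.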
To measure the outlet direction of the tether, which is needed to obtain the tether tension vector T, a device that measures the tether outlet direction was developed. A CAD model of the
device is shown in Fig. 5.22.
It consists of a turntable (yaw angle) with a vertical moving arm (pitch angle).
The arm moves from 0 to 180◦ , and the turntable rotates infinitely. To reduce friction
in the tether, two large pulleys are installed at the root of the arm. As the tether runs
through the device, the tether outlet direction is obtained by the arm direction.
Low-friction potentiometers are attached to the arm and the turntable to measure
the pitch and yaw angles of the arm. Based on the angles, the tether outlet direction
is calculated. Based on the measurement of the tether outlet direction and tether
tension, the tension vector T can be obtained, and position estimation is enabled.
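Combining the winch tension estimate with the measured pitch and yaw angles gives the tension vector. The sketch below assumes a helipad-fixed frame with z upward, pitch measured from the horizontal plane, and yaw about the vertical axis; these frame and sign conventions are illustrative assumptions, not taken from the text.

```python
import math

def tension_vector(tension, pitch, yaw):
    """Tether tension vector (tx, ty, tz) in a helipad-fixed frame from the
    tension magnitude and the measured outlet angles [rad]."""
    horizontal = tension * math.cos(pitch)   # component in the horizontal plane
    return (horizontal * math.cos(yaw),      # x: horizontal, along the yaw direction
            horizontal * math.sin(yaw),      # y: horizontal, normal to x
            tension * math.sin(pitch))       # z: vertical component
```

Under these conventions, the yaw angle fixes the vertical plane containing the catenary, and the pitch angle plays the role of the slope angle θ in Eqs. 5.3 and 5.4, which is exactly what the planar position estimate needs.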
Fig. 5.23 Bottom view of the MUAV with two step-down converters
If the arm points straight upward, it is at a singular point, and the turntable cannot follow. In practice, however, the MUAV rarely approaches from directly above the device.
To realize flight of an MUAV, the power system is very important because flight requires a large amount of electric power. For example, the MUAV in this project (a quad-rotor MUAV with 15-inch propellers, about 2.5 kg) requires 400 W for hovering and 800
W for moving or dealing with disturbances. To reduce loss in the electric resistance
on the power feeding tether, a high voltage and low current power supply system is
configured.
For the multirotor MUAV, voltage step-down converters are used that accept a direct current (DC) input between 200 and 420 V and produce a DC output of 24 V. Each converter provides a continuous output of 600 W. To secure at least 800 W for this MUAV, two converters are used in parallel to obtain an output of 1200 W. This output covers all functions of the MUAV, including the rotation of the multiple motors. Each module weighs 160 g, and the total weight of the two converters is less than the weight of the batteries normally used for this class of flight. The step-down converters are installed on the opposite side of the camera gimbals to balance the center of gravity of the MUAV, and they are cooled by the downwash of the rotors. A bottom view of the MUAV is shown in Fig. 5.23.
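The benefit of the high-voltage, low-current feed can be checked with the resistive-loss formula P_loss = I²R. The tether resistance used below is an assumed round figure for illustration, not a measured value from the project.

```python
def tether_loss(power_w, voltage_v, resistance_ohm):
    """Resistive loss [W] in the feeding tether for a given delivered power:
    I = P / V, then P_loss = I^2 * R."""
    current = power_w / voltage_v
    return current ** 2 * resistance_ohm

# With an assumed 2-ohm round-trip tether resistance, feeding 800 W:
loss_hv = tether_loss(800.0, 400.0, 2.0)  # 2 A at a 400 V feed: 8 W lost
loss_lv = tether_loss(800.0, 24.0, 2.0)   # ~33 A at a direct 24 V feed: ~2.2 kW lost
```

Under these assumed numbers, feeding 24 V directly would dissipate more power in the tether than the MUAV consumes, which is why the system sends 200–420 V up the tether and steps it down on board.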
For the helipad, a commercial power source has typically been used in past cases to handle the large power consumption. However, in a disaster environment, such a power source may not be available.
In the proposed system, control PCs are located on both the MUAV and the helipad,
and both PCs should communicate with each other. On the other hand, it is necessary
to establish wireless communication between the operation room and the construction
robot to remotely control it. To secure the wireless bandwidth, it was decided that
communication between the MUAV and the helipad would not be wireless.
As the weight of the wires for communication affects the payload of the MUAV
considerably, a VDSL communication system that can be realized with only two
communication lines was chosen. VDSL modems were mounted on both the multi-
rotor MUAV and helipad. The control signals for flight and the camera gimbals were
sent from the helipad to the MUAV through the VDSL communication system.
Once all the control signals were gathered in the control PC on the helipad, the
PC communicated with the operator’s PC via the wireless LAN using heavy and
powerful wireless communication devices on the construction robot. Based on the
proposed communication system, various external communication devices could be
used, and integration with other systems on the construction robot became easier.
5.2.4 Experiments
Fig. 5.24 Experimental results. The graph in the upper left shows the flight path of the MUAV on the XY plane, the lower left on the XZ plane, the lower right on the YZ plane, and the upper right in a bird’s-eye view. In all graphs, the red circles indicate the estimated position of the MUAV, and the blue squares indicate the measured position of the MUAV
The position estimated by the developed system did not greatly differ from the true value. The maximum error was 1 m for a cable with a length of 7 m. The main component of the error appeared to be generated by the angular measurement error of the tether outlet direction; therefore, the error became larger as the cable length increased. Note that the maximum wind speed during the experiment was about 5 m/s. According to these results, the proposed position estimation method can be applied to autonomous flight of a tether powered MUAV.
Next, the operational experiment using the proposed system is introduced. Until
now, there have been few examples of remotely controlling an excavator with images
from such a high viewpoint, and it was unknown whether such a viewpoint is effective
for remote control of an excavator. Therefore, experiments were conducted to perform
remote control using images from a high viewpoint obtained from the MUAV. The test subjects were ten operators who operate excavators in their normal work.
Figure 5.25-left shows an overview of the target environment and the target
unmanned excavator, and Fig. 5.25-right shows an operator maneuvering the excavator. The operator had a controller for the hydraulic excavator’s operation and performed remote control while watching the images from the on-vehicle camera on the
cabin and the aerial image. The work for the operator is defined as a “model task”
by the PWRI [33]. The task is to move a target object to a predetermined position with the hydraulic excavator, as shown in Fig. 5.25-left.
As a result of the trials, all ten operators succeeded in the task using images from the multirotor MUAV. In interviews after their operations, their responses included statements such as “An aerial image allows us to grasp the environmental situation for navigation of an excavator, and there is a sense of security.” Because the viewpoint moves drastically, fluctuations due to the MUAV’s flight might be expected to have a negative effect on operability. However, the operators’ main response was “I do not mind it much.” According to the interviews, the current image stability is not a significant issue, and neither is the viewpoint movement required to obtain a free viewpoint.
Observation of the operators’ work showed that many of them mainly used the aerial image for navigating the excavator and the on-vehicle images for the lifting task; in the latter case, the aerial image was also used as an adjunct. On the other hand, some operators focused their attention on the on-vehicle images and made little use of the aerial images.
In this section, a position estimation method for an MUAV with the observation of
a slack tether, an integration of the tether powered MUAV system, and experiments
were introduced.
Until now, flight operators have controlled MUAVs visually. In the near future, automatic flight based on the position of the MUAV estimated by tether state measurement will be implemented.
5.3.1 Background
The work machines used at disaster sites are expected to have high power because they have to perform heavy work such as the removal of large rubble. In addition, disaster sites are often harsh environments, so these machines must be highly resistant to water, mud, and shock. It is therefore appropriate to use hydraulic actuators, just as in construction machines. Such machines should also be controlled remotely in order to prevent secondary disasters at dangerous disaster sites.
Remotely controlled construction machines have already been developed, includ-
ing those used for unmanned construction at Mt. Fugen, Unzen, Japan. However,
these machines have many characteristics that make them unsuitable for use at disaster sites. First, because hydraulic excavators were originally designed for the excavation of earth and sand, the number of degrees of freedom (DOF) of these machines is too small to perform work at a disaster site, such as debris removal. In addition, debris removal requires not only high power but sometimes also very delicate operations.
The current remotely controlled construction machines cannot provide a sufficient
sense of vision and force. Regarding visual information, monocular cameras are often
used, making it difficult to obtain a sense of depth. In the case of on-board operations,
vibration and sound are inherently transmitted to the operator, and such information
helps the operator to understand how much force the machine is applying to the
environment. However, with remote control, such vibration and sound are blocked,
which makes it difficult for the operator to understand how much external force is
applied to the machine.
The lack of such high-fidelity information degrades the work efficiency. In fact, it
is well known that the work efficiency is cut in half compared with direct operation
by an on-board operator [49]. At present, no remotely controlled system has been
established that can achieve a working efficiency comparable to on-board operations.
For the ImPACT Tough Robotics Challenge, we developed a construction robot
platform with a greater DOF than ordinary construction machines. It utilized a high-
response hydraulic servo system. We have also been developing some elemental
technologies and integrating them to realize high-fidelity teleoperation of this con-
struction robot. In this section, we focus on a method to estimate the external force applied to the end-effector.
In this study, we developed a method that can accurately estimate the external
force, even if it contains an impulsive component, by considering the inertia term
of the robot calculated from the angular acceleration of each joint, which can be
estimated using accelerometers placed at each link of the robot. In the following, the
formulation of the external force estimation is first shown, and then the verification
results of experiments such as those conducted during a field evaluation forum are
shown.
Our construction robot has a special structure with a double-swing mechanism where
the lower and upper arms, which can be regarded as robot arms, are attached. Like
the front part of a hydraulic excavator, these arms have a closed link mechanism, and
their dynamics are somewhat complicated. It is possible to calculate the dynamics
of the robot arm, including this closed link mechanism, as shown in [28]. However,
because the inertial influence of the closed link mechanism is not very dominant, we
can regard it as a serial-link arm.
In general, the dynamics equation of a serial-link manipulator, where an external force f ∈ ℝ^m (where m is the dimension of the external force vector) is applied to the end-effector as shown in Fig. 5.26, is given in a Lagrangian formulation as follows [51]:

    τ + J^T f = M(θ)θ̈ + h(θ, θ̇) + τfric(θ, θ̇) + g(θ),   (5.5)
where τ ∈ ℝ^n denotes the joint driving force vector, θ ∈ ℝ^n is the joint angle vector, M ∈ ℝ^(n×n) is the inertia matrix, h ∈ ℝ^n represents the Coriolis and centrifugal force term, and g ∈ ℝ^n denotes the gravity term. τfric ∈ ℝ^n denotes the friction term, which includes the Coulomb friction and viscous friction at the joints, and f ∈ ℝ^m denotes the external force vector. The matrix J ∈ ℝ^(m×n) on the left-hand side of Eq. (5.5) is the Jacobian matrix of this manipulator, which is defined as follows, depending on the definition of the end-effector velocity:
    v = J θ̇,   (5.6)
    vt = Jt θ̇,   (5.7)
    vtr = Jtr θ̇.   (5.8)
The external force vector f can be specifically defined as either a linear force vector ft ∈ ℝ^3 or a six-dimensional force vector ftr = [ft^T m^T]^T ∈ ℝ^6, which includes a linear force component and a moment component, corresponding to the end-effector velocity definitions vt and vtr, respectively.
An ordinary construction machine has four joints for swing, boom, arm, and tilt
motions, and it is impossible to estimate all of the components of the translational
force vector and moment vector applied to the end-effector from the cylinder pressure
of each joint. On the other hand, the upper arm of our construction robot has 6 DOF,
and it is theoretically possible to estimate all of the components of the translational
force vector and moment vector applied to the end-effector from the cylinder pressure
of each joint. However, during operation of the construction robot, the cylinder of a joint often reaches the end point of its motion range. In this case, the oil pressure value of that cylinder is no longer valid, and the degree of freedom of the arm essentially decreases, which makes it impossible to estimate all of the elements of ftr.
Therefore, we assume that only the translational force ft is applied to the end-
effector of the manipulator, and this ft is the external force that should be estimated.
Then, the expression of Eq. (5.5) can be rewritten as follows:

    τ + Jt^T ft = M(θ)θ̈ + h(θ, θ̇) + τfric(θ, θ̇) + g(θ).   (5.9)
At a certain state of the manipulator (θ, θ̇, θ̈) in Eq. (5.9), the joint torque when no external force is applied, τfree, is given by

    τfree = M(θ)θ̈ + h(θ, θ̇) + τfric(θ, θ̇) + g(θ).   (5.10)
In the above equation, τfree represents the joint torque assuming that no external
force is applied to the end-effector, and it can be obtained by an inverse dynamics
calculation such as the Newton-Euler method [51]. The actual joint torque, τ , can
be obtained by measuring the actual cylinder pressures when the external force is applied. Taking the difference between Eqs. (5.9) and (5.10) yields

    Jt^T ft = τfree − τ = τ̂.   (5.11)
Because the upper arm of our construction robot has six joints (n = 6), the Jacobian transpose Jt^T is a non-square tall matrix if no cylinder of any joint has reached its end point. Then, in a case where n > 3, ft can be estimated using the pseudo-inverse of Jt^T, denoted ( Jt^T )†, as shown in the following equation:

    ft* = ( Jt^T )† τ̂.   (5.12)

Equation (5.12) is an over-constrained problem because there are more than three equations for three unknowns. Using the pseudo-inverse makes it possible to obtain the external force ft* that minimizes ‖ Jt^T ft* − τ̂ ‖.
When the cylinder of a certain joint reaches the end point, the cylinder pressure
value becomes invalid, and it is necessary to delete the corresponding row of the
Jacobian matrix. Namely, we regard a joint whose cylinder has reached its end point as a fixed joint. As long as n′ ≥ 3 holds, where n′ (< n) denotes the number of effective joints after several cylinders have reached their end points, it is possible to estimate the linear force ft using Eq. (5.12).
Moreover, when the reliability of the measured torque value of a certain joint is
low, we can put a lower weight on that value. To do so, we introduce the diagonal
weight matrix W , where the value of the element corresponding to the joint with low
confidence is set small. Multiplying both sides of Eq. (5.11) by this weight matrix
W from the left, we get
    W Jt^T ft = W τ̂.   (5.13)
From this, the external force can be estimated using the following equation:

    ft* = ( W Jt^T )† W τ̂.   (5.14)
Fig. 5.27 Overview of construction robot experimental test bed at Kobe University
Using Eq. (5.14) makes it possible to obtain the estimated external force ft* that minimizes the weighted norm ‖ W ( Jt^T ft* − τ̂ ) ‖. In the following, unless otherwise noted, the weight matrix W is set to the identity matrix.
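The least-squares estimation of Eqs. (5.12)–(5.14), including the row deletion for joints whose cylinders are at an end point, can be sketched with NumPy. This is an illustrative sketch, not the authors’ code; the function name and the sign convention for τ̂ (taken here as τfree − τ) are assumptions.

```python
import numpy as np

def estimate_external_force(J_t, tau_hat, weights=None, active=None):
    """Least-squares estimate of the linear external force f_t.

    J_t     : (3, n) translational Jacobian of the arm
    tau_hat : (n,) joint torque attributed to the external force
    weights : optional (n,) confidence weights (diagonal of W)
    active  : optional (n,) boolean mask; joints whose cylinders have
              reached an end point are masked out (treated as fixed)
    """
    A = np.asarray(J_t, dtype=float).T       # (n, 3): rows of J_t^T
    b = np.asarray(tau_hat, dtype=float)
    if active is not None:                   # delete rows of invalid joints
        A, b = A[active], b[active]
        if weights is not None:
            weights = np.asarray(weights, dtype=float)[active]
    if A.shape[0] < 3:
        raise ValueError("need at least three effective joints")
    if weights is not None:                  # weighted problem W J_t^T f = W tau_hat
        A = weights[:, None] * A
        b = weights * b
    # pseudo-inverse solution minimizing || W (J_t^T f - tau_hat) ||
    return np.linalg.pinv(A) @ b
```

With six joints and a rank-3 Jacobian, the six torque equations over-constrain the three force components, and the pseudo-inverse returns the minimizer of the (weighted) residual.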
In the experiment described below, we assume that the term for the Coriolis and centrifugal forces is negligibly small when calculating τfree using Eq. (5.10). Note, however, that the contribution of the friction term in the construction machine is too large to be ignored. Therefore, in the following experiments, τfree is obtained by the following equation:

    τfree = M(θ)θ̈ + τfric(θ, θ̇) + g(θ).   (5.15)
Hereafter, we use the term Method AP for the method of estimating the external force by substituting the τfree value obtained by Eq. (5.15) into Eq. (5.11), namely, the method that uses not only the cylinder pressures but also the accelerometer information. On the other hand, we use the term Method P for the method that uses

    τfree = τfric(θ, θ̇) + g(θ)   (5.16)

instead of Eq. (5.15), namely, the method that does not use accelerometers.
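The difference between the two variants reduces to whether the inertia term enters τfree. A minimal sketch, under the assumption (as reconstructed here) that Eq. (5.15) keeps M(θ)θ̈ while Eq. (5.16) drops it; the function names are illustrative.

```python
import numpy as np

def tau_free_method_ap(M, theta_ddot, tau_fric, g_term):
    """Method AP: free-motion torque including the inertia term, with the
    joint accelerations estimated from the link-mounted accelerometers."""
    return M @ theta_ddot + tau_fric + g_term

def tau_free_method_p(tau_fric, g_term):
    """Method P: pressure-only variant that omits the inertia term."""
    return tau_fric + g_term
```

Under an impulsive contact, M(θ)θ̈ is large, so Method P misattributes the inertial torque to the external force, while Method AP subtracts it out.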
The dual-arm construction robot was revealed to the public for the first time at the
field evaluation forum in November 2017. We had been using a single arm construc-
tion robot, the predecessor of the dual-arm construction robot, and conducted sev-
eral experimental verifications with this robot in the earlier field evaluation forums.
Because the opportunities to conduct verification experiments using these construc-
tion robots tend to be limited, with the field evaluation forum held twice a year and
preparation stages held just before the forum, we developed an experimental test
bed based on a mini-size excavator and conducted preliminary verification experi-
ments at Kobe University before implementing our method in construction robots.
In this section, we will first introduce the results of experiments conducted using this
experimental test bed at Kobe University.
Figure 5.27 gives an overview of the experimental test bed at Kobe University. This
test bed was based on a mini-excavator (PC-01 by KOMATSU). Its hydraulic
pump is driven by an electric motor, which makes indoor experiments possible. An additional
linear encoder was attached to each cylinder to measure the length of the cylinder
rod, and pressure sensors at both ends of each cylinder made it possible to measure
the cylinder pressure.
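The joint torque can be recovered from the head- and rod-side pressures, since the net cylinder force is the head-side pressure times the piston area minus the rod-side pressure times the annular area. The sketch below illustrates this standard relation; the bore and rod dimensions and the fixed moment arm are hypothetical, not the PC-01's actual geometry (in practice the effective moment arm varies with the joint angle).

```python
import math

def cylinder_force(p_head, p_rod, bore_d, rod_d):
    """Net axial cylinder force [N] from head- and rod-side pressures [Pa]."""
    a_head = math.pi * (bore_d / 2.0) ** 2          # full piston area
    a_rod = a_head - math.pi * (rod_d / 2.0) ** 2   # annular rod-side area
    return p_head * a_head - p_rod * a_rod

def joint_torque(f_cyl, moment_arm):
    """Torque [N m] about the joint for the current effective moment arm [m]."""
    return f_cyl * moment_arm
```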
For the sake of simplicity, we will only consider the boom link and arm link,
which makes it possible to regard the test-bed as a 2 DOF manipulator, as shown in
Fig. 5.28. Hereafter, the vertical surface where the boom link and the arm link move
will be called a sagittal plane.
In the Newton-Euler method, the velocity and acceleration of each link are first
calculated from the base link toward the end link as the forward calculation. However,
because the accelerometer can measure the absolute acceleration with respect to the
world coordinate frame, which is the inertial coordinate system, it is possible to
start the forward calculation from the boom link, whose first joint is located at the
bottom, provided that the acceleration of this link can be accurately determined by
attaching multiple accelerometers to it.
Therefore, multiple accelerometers were attached to the boom link in this case.
Because the arm movement was restricted to the sagittal plane, three accelerometers
(AS-20B by Kyowa Electric Industry Co., Ltd.) were attached to the boom link, as
Fig. 5.29 Accelerometers installed on robot experimental test bed at Kobe University
shown in Fig. 5.29. From the values of these three accelerometers, the acceleration of
the boom link, including two translational components and one rotational component,
could be measured. Because the movement of the arm link had only 1 DOF relative
to the boom link, just one accelerometer (AS-20B by Kyowa Electric Industry Co.,
Ltd.) was attached to measure the acceleration of the arm link, in such a way that it
was sensitive to the acceleration caused by the angular acceleration of the arm joint.
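Recovering the planar acceleration of a link from single-axis accelerometers amounts to solving a small linear system: a sensor at position r with sensing direction d reads d · (a + α × r), plus a centripetal term that this sketch neglects for clarity. The sensor layout in the example is hypothetical, not the actual placement on the boom.

```python
import numpy as np

def planar_body_accel(positions, directions, readings):
    """Recover (ax, ay, alpha) of a planar rigid link from >= 3 single-axis
    accelerometers, neglecting the centripetal (omega^2) term for simplicity.

    positions  : (n, 2) sensor positions in the link frame [m].
    directions : (n, 2) unit sensing directions.
    readings   : (n,) measured accelerations [m/s^2].
    """
    rows = []
    for (rx, ry), d in zip(positions, directions):
        # point acceleration = a + alpha * (-ry, rx); project onto axis d
        rows.append([d[0], d[1], d[0] * (-ry) + d[1] * rx])
    sol, *_ = np.linalg.lstsq(np.array(rows), np.asarray(readings), rcond=None)
    return sol  # (ax, ay, alpha)
```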
The signals from the cylinder rod encoders, cylinder pressure sensors, and
accelerometers could all be acquired within a cycle of 1 [ms]. The test-bed robot
could be operated with a joystick, and the true value of the external force was mea-
sured with a force plate. In the experiment, the tip of the bucket of the robot was
pushed vertically against the force plate from a height of approximately 0.4 [m], and
the pushing motion was continued for approximately 0.7 [s].
The experiment was conducted by comparing the method using the accelerometers
and cylinder pressure (method AP) and the method using only the cylinder pressure
without using the accelerometers (method P). Figure 5.30 shows the estimated force
in the vertical direction, together with the true value measured by the force plate in
that direction.
Fig. 5.30 Estimated vertical force for (a) method AP and (b) method P, together with the true value measured by the force plate
As shown in Fig. 5.30, a large impulsive force of approximately 1,500 [N] was
measured by the force plate when the end-effector hit it. According to the specifi-
cation for the PC01, which was the base machine of our experimental test bed, the
maximum excavating force is 4,000 [N]. Thus, the impulsive force at the instant of
contact corresponds to approximately 38% of the maximum excavating force of this
equipment.
The peak of the impulsive force could be effectively estimated using the method
with the accelerometers (method AP), although the magnitude of the estimated force
slightly differed from the true value. On the other hand, in method P, which only used
the cylinder pressure, the peak of the impulsive force could not be estimated at all.
These results confirmed that it was possible to accurately estimate the external force,
including the impulsive force, by attaching accelerometers to the robot. However,
it should be noted that even with method P, the static force after the impact could
be accurately estimated. From this, it can be seen that it is sufficient to estimate the
external force using only the cylinder pressure when it does not contain the impulsive
force component due to an instantaneous change in the momentum of the arm.
Because the effectiveness of the proposed method was confirmed on the test bed
at Kobe University, we implemented this external force estimation method on the
dual-arm construction robot and performed a haptic feedback experiment using the
estimated force, as discussed in the next section.
Fig. 5.32 System configuration for bilateral control of dual-arm construction robot
The end-effector of the robot arm was positioned approximately 1.2 [m] above the
ground, and then moved to hit the ground approximately 5 times. Figure 5.34 shows
the estimation results in the z-axis direction (vertical direction). Particularly with
method AP using the acceleration sensor, a sharp peak appears in the plot, showing
that a large impulsive force at the moment of contact can be estimated. However, one
can see that the peak was not observed at the first contact even with method AP. This
was presumably because data sampling with a period of 10 [ms] may sometimes
fail to acquire the peak of an instantaneous change in acceleration. In method P, on
the other hand, the impulsive force could not be estimated. Note, however, that the
static pushing force after collision could be estimated as with method AP. In this
experiment, because we could not bring the force plate into the field, we could not
compare the estimated value with the true value.
In the case of the motion to hit the gravel on the ground, the end-effector hit
the ground approximately 8 times from approximately 1.3 m above the ground.
Figure 5.35 shows the estimation result for the z-axis direction (vertical direction).
In this case, a small impulsive force was observed at each hit only by method AP
using the acceleration sensor. With method P, on the other hand, the impulsive force
could not be observed at all, and the estimated force profile looks similar to that in
the case of hitting the concrete block. Therefore, method AP is expected to be able to
distinguish between a concrete block and gravel from the estimated contact reaction
force alone.
Actually, the operator who performed the teleoperation with bilateral control had
the impression that the force feedback of method P felt soft even when hitting a
hard floor. In method AP, on the other hand, the operator could feel a crisp
force close to the impulsive force when hitting a concrete block. When hitting the
gravel ground, their impression was that the sense of impact was weaker than with
the concrete block.
We set the force scale so that the impulsive force (a maximum value of 15,000
[N]) could be presented within the maximum output (7.9 [N]) of the haptic device
(Phantom Desktop made by GeoMagic). Under this force scaling, however, the mag-
nitude of the displayed force for the static pushing force became so small that it
could hardly be felt by the operator. Therefore, it is necessary to introduce
independent scaling ratios for the impulsive force and the static force. In future
work, it will also be necessary to compare the results of cognitive tasks, such as
contact judgment and material discrimination, across multiple operators for method
AP and method P, with appropriate settings of the force scale ratios for the impulsive
and static forces.
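One way to realize such independent scaling is sketched below. The split into static and impulsive components via a first-order low-pass filter is our illustrative assumption, not the authors' design; the 7.9 N device maximum and the 15,000 N peak are taken from the text.

```python
def make_dual_scale_mapper(k_static, k_impulse, f_max, alpha=0.05):
    """Map an estimated force [N] to a haptic command [N] with separate gains
    for the slowly varying (static) and transient (impulsive) components.
    alpha is the smoothing factor of a first-order low-pass filter."""
    state = {"low": 0.0}

    def display_force(f):
        state["low"] += alpha * (f - state["low"])   # low-pass: static part
        transient = f - state["low"]                  # residual: impulsive part
        cmd = k_static * state["low"] + k_impulse * transient
        return max(-f_max, min(f_max, cmd))           # clip to device limit
    return display_force
```

With k_impulse ≈ 7.9/15,000 the largest impulses map onto the device's full range, while a larger k_static keeps sustained pushing forces perceptible.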
In May 2018, we held a field evaluation forum in the Fukushima Test Field and
presented a demonstration using the dual-arm construction robot. In this demonstra-
tion, the pilot cockpit was totally renovated and the 6-axis haptic device shown in
Fig. 5.36 and a cylindrical screen were introduced. Figure 5.37 shows a snapshot
of the demonstration presented at the field evaluation forum, simulating search and
rescue work at a collapsed house.
The details of this demonstration are described in other sections. Like the last
demonstration in November 2017, the external force applied to the end-effector of
the upper arm was estimated, and a bilateral control system was configured using
the estimated external force. In the demonstration of the forum this time, the upper
Fig. 5.37 Snapshot of rescue operation demonstration in field evaluation forum in May 2018
arm was used for peeling off the roof of the collapsed house, as shown in Fig. 5.37.
To peel off the roof, it was first necessary to hook the lower end of the roof with the
tip of the end-effector of the upper arm. Because it was difficult for the operator to
capture the depth information using the image from a camera installed at the tip of the
upper arm, the reaction force fed back to the operator through the haptic device when
the end-effector contacted the roof was very helpful when approaching the roof. In
addition, when peeling off the roof in an upward direction after firmly grasping the
lower end of the roof, force feedback made it possible for the operator to confirm that
the task was progressing normally, while preventing them from applying excessive
force to the environment.
Although the detailed results cannot be shown here due to space limitations, the
effectiveness of the external force estimation and bilateral control using
the estimated external force was demonstrated at the field evaluation forum in May
2018.
5.3.4 Conclusion
In this section, we showed a method to accurately estimate the external force applied
to the end-effector of the dual-arm construction robot using hydraulic pressure sen-
sors arranged at both ends of the hydraulic cylinder of each joint of the robot, so that
the estimated force can be fed back to the operator to realize high-fidelity teleoper-
ation of the construction robot.
Although the static force could be accurately estimated, it was difficult to estimate
the impulsive force using only the hydraulic cylinder pressure. In this study, we
developed a method to accurately estimate the external force even if it contains an
impulsive component, by considering the inertia term of the robot calculated from the
angular acceleration of each joint, which was estimated from accelerometers placed
at each link of the robot. The developed method was verified using the test bed at Kobe
University, as well as in the dual-arm construction robot. In addition, demonstrations
were conducted under a realistic scenario at the field evaluation forums.
In future work, it will be possible to estimate not only the translational components
but also the moment components of the external force vector. When the cylinder
reaches the end point, however, the corresponding cylinder pressure becomes invalid,
and special care is necessary. Moreover, by implementing bilateral control not only
for the upper arm but also for the lower arm, it will be possible to perform a dual arm
coordinated manipulation task with force feedback from both the upper and lower
arms.
In the construction robot group, Konyo and his colleagues have been developing
a method to provide tactile feedback to the operator through a vibration actuator
embedded in a wristband worn by the operator. Vibration sensors were installed near the
end-effectors of the upper arm and lower arm, separately from our accelerometers,
and the measured vibration was converted to a signal with a different frequency so
that the operator could easily perceive it. Because this tactile feedback and our force
feedback contain complementary information, in future work, we will also improve
the fidelity of the teleoperation by integrating these two methods.
5.4.1 Introduction
Remote operations of construction robots lack haptic feedback, which reduces their
usability and maneuverability.
High-frequency contact vibrations often occur, and they are important tactile cues
representing the details of contact characteristics. Humans usually perceive environ-
ments using both tactile and kinesthetic cues, and it has been reported that the realism
of contacting materials is improved by adding tactile cues to kinesthetic ones [26].
Several researchers have reported that, by representing high-frequency vibrations,
humans can perceive the properties of simulated materials, e.g., textures and stiff-
ness [7, 37, 50]. A vibrotactile feedback system for telesurgery has also been proposed
and qualitatively evaluated [31], an example of how vibrotactile cues can support
teleoperation.
This section shows a transmission system of high-frequency vibration for sup-
porting the teleoperation of construction robots.
The first characteristic of the vibrotactile transmission system is the modulation
methodology of high-frequency vibration. The frequency range of the vibrations
measured on a construction robot is often out of the sensitive range for humans. For
example, it was observed that the contact vibrations measured on the metal arm during
digging in the ground contained peak frequencies above 2 kHz. This frequency range
is higher than the human-sensitive range, which is approximately 40–800 Hz [3]. The
methodology modulates such a high-frequency signal into a signal that is sensible
for humans, while preserving important information about the contact vibrations,
such as material properties.
The second characteristic of the vibrotactile transmission system is its easily imple-
mentable sensor and display components, which lead to high versatility and durability.
The sensor and display can be applied to an existing teleoperation system of a
construction robot without altering the system. The sensing system measures the
vibration propagated through the body of the construction robot; therefore, the sensor
can be attached at a distance from the manipulating tip, which makes it difficult to
break. The wristband-shaped vibrotactile display enables the operator to perceive
vibrotactile feedback signals without disturbing various types of interfaces such as
a joystick, a joypad, or a force feedback interface.
Fig. 5.38 (c) Amplitude-modulated vibration v_am(t). Carrier frequency: frequency in the sensitive range for humans; amplitude: upper and lower envelopes
these contact characteristics. Even though the peak frequency range of the measured
vibratory signals is too high for humans to detect [3], the envelope of the signals'
amplitude often varies at much lower frequencies, a trend that was observed in our
measurements, as shown in Fig. 5.38a, b. Therefore, humans may perceive the
contact characteristics more clearly from the envelope of the original signal, shown
in Fig. 5.38b, than from the original signal itself.
In addition, human perceptual characteristics regarding the vibrotactile signals’
envelope have been investigated in the past. Researchers reported that humans can
perceive the envelope of an amplitude-modulated vibration whose carrier frequency
is higher than 1 kHz [27, 29]. We focused on this human perceptual characteristic
regarding the envelope of vibrotactile signals. If the amplitude of a sinusoidal
vibration whose carrier frequency is within the human-sensible range is modulated
with the envelope of the original signal's amplitude, as shown in Fig. 5.38c, humans
may perceive the contact characteristics more clearly than with the simple envelope
signal shown in Fig. 5.38b.
Measured vibrotactile signals often contain noise that accompanies the operation of
construction robots. Therefore, before the main process, the noise is removed from
the measured signal using a noise-subtraction technique. McMahan et al. proposed
a method in which the spectrum of the vibration in the non-contact condition is
subtracted from that of the vibration measured while the robot is active [30]. Because
such noise signals were also observed in the construction robot, this method was
adopted. In this method, a measured signal is transformed into frequency-domain
information (a power spectrum) using a short-term Fourier transform (STFT), and then
the power of the noise signal, which is defined in advance, is subtracted from that of the
measured signal. The resulting frequency-domain information is then transformed back
into time-domain information using an inverse short-term Fourier transform (ISTFT).
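The noise-subtraction step can be sketched as follows. This is a simplified, non-overlapping rectangular-window version of the STFT/ISTFT procedure described above (magnitude subtraction with the phase retained), not the exact implementation.

```python
import numpy as np

def spectral_subtract(signal, noise, frame=256):
    """Frame-wise spectral subtraction: remove the average noise magnitude
    spectrum (estimated from a no-contact recording) from the signal."""
    n_noise = len(noise) // frame
    noise_mag = np.mean(
        [np.abs(np.fft.rfft(noise[i * frame:(i + 1) * frame]))
         for i in range(n_noise)], axis=0)
    out = np.zeros(len(signal) // frame * frame)
    for i in range(len(signal) // frame):
        spec = np.fft.rfft(signal[i * frame:(i + 1) * frame])
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)  # subtract, floor at 0
        out[i * frame:(i + 1) * frame] = np.fft.irfft(
            mag * np.exp(1j * np.angle(spec)), n=frame)   # keep original phase
    return out
```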
The process flow is shown in Fig. 5.38. We extract an upper envelope eu (t) and a
lower one el (t) by finding the points where an original signal v(t) is convex upward
and downward, and separately applying linear interpolation to those points. Then,
an amplitude modulated signal vam (t), shown in Fig. 5.38c, is determined by the
following:
v_am(t) = A(t) sin(2π f t) + v_o(t), (5.17)
where A(t) is the amplitude of the modulated vibration, f is the carrier frequency
chosen within the human-sensible range, and v_o(t) is the offset of the signal.
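A minimal sketch of Eq. (5.17): detect local maxima and minima of the measured signal, linearly interpolate them into upper and lower envelopes, and remodulate onto a human-sensible carrier. The sampling rate and test signal in the check are illustrative, not recorded data.

```python
import numpy as np

def amplitude_modulate(v, fs, f_carrier):
    """Build v_am(t) = A(t) sin(2*pi*f*t) + v_o(t) from an oscillatory signal
    v sampled at fs, per Eq. (5.17): A(t) is half the gap between the upper
    and lower envelopes, and v_o(t) is their midpoint."""
    t = np.arange(len(v)) / fs
    idx = np.arange(len(v))
    up = np.where((v[1:-1] > v[:-2]) & (v[1:-1] > v[2:]))[0] + 1  # local maxima
    lo = np.where((v[1:-1] < v[:-2]) & (v[1:-1] < v[2:]))[0] + 1  # local minima
    e_u = np.interp(idx, up, v[up])   # upper envelope (linear interpolation)
    e_l = np.interp(idx, lo, v[lo])   # lower envelope
    A = (e_u - e_l) / 2.0
    v_o = (e_u + e_l) / 2.0
    return A * np.sin(2.0 * np.pi * f_carrier * t) + v_o
```

For the setting of the first experiment below, f_carrier would be 550 Hz.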
5.4.3.1 Objective
5.4.3.2 Participants
Six volunteers participated in the experiment. They were not aware of the purpose
of the experiment.
5.4.3.3 Apparatus
The apparatus for measuring vibrations is shown in Fig. 5.39a, b. A vibration sen-
sor (NEC TOKIN, VS-BV203) is attached to the handle of a shovel. The sampling
frequency for the vibration sensor is 50 kHz. The shovel is slid by a 1-DoF linear
actuator (SMC Corporation, LEFS40B-1000).
Fig. 5.39 a Apparatus for recording the vibrations of the shovel digging the material with a one-
DoF sliding motion. b Vibration sensor attached to the shovel. c Vibrotactile display for generating
the vibrotactile stimulus
(a) Original vibrotactile signal; (b) upper envelope of original signal; (c) amplitude-modulated vibration (left panels: amplitude [V] vs. time [s]; right panels: frequency [Hz])
Fig. 5.40 Examples of three types of vibrotactile signal for the evaluation experiment
Six types of vibrotactile signals were measured, including three types of mate-
rials (small-size gravel, large-size pumice, and small-size pumice) and two slid-
ing velocities (50 and 200 mm/s). From each of the six signals, five vibrotactile
stimuli (2.5 s each) were extracted; four stimuli were used for the actual trials and
one was used for the training trials. In addition, the three types of modulation shown
in Fig. 5.38 were applied to the stimuli. The carrier frequency f of method (c) was set to
550 Hz. Examples of the three types of modulated vibrotactile stimulus are shown in
Fig. 5.41 Correct answer ratios using three types of vibrotactile signals for each condition (velocity × material). ∗∗: p < .01, ∗: p < .05
5.4.3.5 Results
Figure 5.41 shows the correct answer ratios for the six conditions. To investigate which
modulation methods improved the discriminability of the stimuli, multiple comparison
tests using the Bonferroni method were performed.

Table 5.1 Ratios of discriminable conditions at which correct answer ratios are higher than the
chance level

  Type of vibrotactile signal         Original vibration   Envelope   Amplitude modulated
  Ratio of discriminable conditions          1/6              3/6            5/6

First, for the 50 mm/s velocity
and the small-size gravel condition, there were significant differences between the
original and envelope signals ( p = .033) and between the original signals and the
amplitude-modulated vibration ( p = .007). Second, for the 200 mm/s velocity and
the small-size pumice condition, there were significant differences between the orig-
inal signals and the amplitude-modulated vibration ( p = .029).
In addition, we counted the number of discriminable conditions, i.e., those in which
the correct answer ratio was higher than the chance level (1/6).
The result is shown in Table 5.1, which indicates that the proposed methodology
improved the discriminability of the contact conditions.
5.4.4.1 Objective
The first experiment showed that the proposed method improved the transmitted
information and that humans can discriminate the material properties of contact
surfaces and the motion velocity of the robot. However, in more practical situations,
the subjective impressions produced by the proposed methods remained unclear.
Therefore, using vibrations measured on a construction machine during interactions,
we investigated the effects of the proposed methods in terms of various evaluation
items through subjective evaluation.
5.4.4.2 Participants
Four volunteers participated in the experiment. They were different from the
participants in the first experiment and were not aware of the purpose of the experiment.
5.4.4.3 Apparatus
Vibrotactile signals on the construction machine were measured together with a synchro-
nized movie, as shown in Fig. 5.42. The vibration sensor is the same as that used in the
Fig. 5.42 Measurement condition on an actual construction machine. A vibration sensor is attached
to the arm of the excavator
(a) Original; (b) envelope; (c) amplitude-modulated vibration (left panels: amplitude [V] vs. time [s]; right panels: power [V²] vs. frequency [Hz])
Fig. 5.43 Four types of vibrotactile signals for the second experiment
first experiment, and the sampling frequency is 5 kHz. The movie shows the construc-
tion machine's excavation work from a fixed viewpoint. The vibrotactile signals
were delivered to the participants by the same apparatus as in the first experiment.
Three types of stimuli (16 s each) were prepared for the second experiment using the
proposed modulation method, as shown in Fig. 5.43. For the noise-subtraction
pre-process, we used the vibrations recorded while only the engine was running,
without arm movement. The constant carrier frequency f_c was 300 Hz.
In each trial, a vibrotactile stimulus (16 s) and a synchronized movie were presented
to a participant, and the participant then answered the five questionnaire items described
below on a seven-point scale (from 1 to 7). The participants could experience a
stimulus several times before deciding on their answers.
Q.1: How similar did you perceive the vibrotactile feedback while the construction
machine dug the ground, to the one you expected?
Q.2: How strong did you perceive the vibrotactile feedback while the construction
machine collided with the ground?
Q.3: How strong did you perceive the vibrotactile feedback while the construction
machine was digging the ground?
Q.4: How much do you think the vibrotactile feedback hindered the operation?
Q.5: How synchronized was the vibrotactile feedback with the motion of the
construction machine?
Before the trials, the participants experienced all three stimuli with the five question-
naire items several times as training trials. Then, 30 trials (ten trials for each stimulus)
were conducted for each participant. The participants took a five-minute break after
20 trials. The entire experiment took approximately 45 min. The order of the presented
stimuli was randomized for each participant.
5.4.4.5 Results
The evaluation values for the five questionnaire items are shown in Fig. 5.44.
A two-way ANOVA indicated significant differences among the three types of
vibrotactile stimuli and among the participants for all five items.
Several significant differences can be seen in Fig. 5.44; because it is difficult to
discuss all of them, we focus on selected results.
First, for Q.1, the evaluation of the amplitude-modulated vibration was significantly
higher than that of the envelope vibration ( p = 0.019, Bonferroni). For providing realistic
digging sensations, such as the textures of materials, the amplitude-modulated vibration is
appropriate, although there was no significant difference from the original signal.
For Q.2, the amplitude-modulated and envelope vibrations showed significantly higher
evaluation values than the original signal ( p = 1.1 × 10−41 and p = 1.4 × 10−13, Bonfer-
roni, respectively). These results show that the amplitude-modulated and envelope
vibrations improve the reality of collisions with surfaces, which is an advantage
for teleoperation because operators can clearly perceive contact with the environment.
Next, for Q.3, the amplitude-modulated and envelope vibrations showed
significantly higher values than the original signal ( p = 3.7 × 10−16 and
Fig. 5.44 Subjective evaluation values (seven-point scale) for the five questionnaire items, panels (a) Q1 to (e) Q5. ∗∗∗: p < .001, ∗: p < .05
p = 7.6 × 10−5, Bonferroni, respectively). These results support the view that the
modulations increased the perceived strength of contact with surfaces.
The above results suggest that the amplitude-modulated and envelope signals perform
well in providing a strong sense of presence. However, for Q.4 and Q.5, the
performance of the envelope signal was not good. For Q.5, which is related to
perceptual synchronization with the visual feedback, the envelope signal showed
significantly lower values than the amplitude-modulated and original signals
( p = 7.9 × 10−4 and p = 2.2 × 10−4, Bonferroni, respectively). These results may
stem from the fact that the envelope signal contains lower-frequency components than
the other signals, so the participants may have expected different motion velocities or
movements.
The envelope signal may thus disturb the operation, although it is expected to
convey information about the properties of the contact environment. On the other
hand, the amplitude-modulated signal successfully enhances human perception of
the reality of contact, synchronization with the visual feedback, and the discrim-
inability of material properties and the robot's motions.
Fig. 5.45 a Vibration sensors attached on the arm of the construction robot. b A wristband-shaped
vibrotactile display for an operator
As shown in Fig. 5.45a, a sensor box containing a piezoelectric vibration sen-
sor (NEC TOKIN, VS-BV203) is attached at a position slightly away from the
manipulator of the robot. It measures the vibration propagated from the manipulator,
so that vibrotactile information can be acquired without direct contact between
the sensor and the environment. Vibration signals are sampled at 8 kHz.
5.4.6 Conclusion
This section described a transmission system for high-frequency vibration signals
to support the teleoperation of construction robots. The modulation methodology
for high-frequency vibration, the key technology of the system, was explained and
evaluated through two experiments. The methodology shifts the high carrier frequency
of an original signal to a carrier frequency within the human-sensitive range while
maintaining the envelope of the original signal, which can effectively represent
the properties of contact environments. The vibrotactile transmission system
implemented on the construction robot was then presented. Because the system
consists of easily implementable sensor and display components, it can be adopted
in existing teleoperation systems without altering them.
When operators teleoperate robots, they need to monitor the surroundings of the
robot (Fig. 5.46). The operators therefore watch displays fed by cameras or
panoramic camera sets mounted on the robot and operate while confirming safety [19].
However, teleoperation reduces operability. We therefore propose three methods
for preventing this reduction in operability.
The first method is an arbitrary viewpoint visualization system. One cause
of reduced operability is blind spots; therefore, camera images are integrated into
one large image to reduce blind spots [38]. Another cause is the first-person view: it
is easier to operate in a third-person view than in a first-person view. Okura proposed
an arbitrary viewpoint system using a depth sensor and cameras [38]. The depth sensor is
used to generate the shape of the 3D model onto which images are projected. However,
the construction robot may move objects in the disaster area; in other words, the
surrounding environment may change as the robot works. Therefore,
a precise model is not required when the robot is teleoperated.
General cameras have a narrow field of view, which can lead to blind spots in the
area close to the robot. Because the near field of view is the most important for
operation, we use fisheye cameras, whose very wide field of view lets us capture
the near field. However, because fisheye images have peculiar distortions, a highly
visible image cannot be obtained simply by integrating the raw images. Therefore,
we propose an arbitrary viewpoint visualization system using fisheye cameras
(Fig. 5.47).
The second method is a robot direction display system. The operator sometimes
loses track of the robot's heading direction when using arbitrary viewpoint
visualization, which is one cause of reduced operability. Therefore, we propose a
robot direction display system for arbitrary viewpoint visualization.
The third method is bird's-eye view image generation even under camera mal-
function. The robot sometimes malfunctions in disaster areas, and when cameras
fail, operability is markedly reduced. Fisheye cameras have a field of view of 180°
or more; therefore, with two cameras attached to each side of the robot (four cameras
in total), we can theoretically obtain a 360° field of view. In other words, even if one
camera breaks down, its image can be interpolated from another camera [23].
Therefore, we propose a system for bird-eye view image generation under camera
malfunction.
In this section, we explain the proposed method. The arbitrary viewpoint visualization
is generated as follows:
1. We generate a shape of the 3D model for projecting the surrounding environment
as shown in Fig. 5.48.
2. We integrate fish-eye camera images into a bird-eye view image as in [41].
3. We project the image to the 3D model.
4. We set a 3D robot model at the center of the 3D model.
We generate a shape of the 3D model for displaying the surrounding environment.
The surrounding environment is approximated as a hemispherical model (Fig. 5.48).
The hemispherical model consists of a ground plane for monitoring the near area
and a spherical face for monitoring the distant area.
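As a concrete illustrative sketch (not the authors' implementation), the projection surface of this hemispherical model could be generated as a vertex grid: a flat ground disk for the near area, topped by a spherical dome for the distant area. The radius and tessellation counts below are arbitrary assumptions.

```python
import numpy as np

def hemisphere_mesh(radius=10.0, ground_rings=8, dome_rings=8, sectors=32):
    """Generate vertices for the projection surface: a ground disk
    (z = 0) inside the hemisphere, plus a dome above it."""
    verts = []
    # ground plane: concentric rings from the robot outward
    for r in np.linspace(0.0, radius, ground_rings):
        for t in np.linspace(0.0, 2 * np.pi, sectors, endpoint=False):
            verts.append((r * np.cos(t), r * np.sin(t), 0.0))
    # spherical face: elevation from the horizon (0) up to the zenith
    for phi in np.linspace(0.0, np.pi / 2, dome_rings):
        for t in np.linspace(0.0, 2 * np.pi, sectors, endpoint=False):
            verts.append((radius * np.cos(phi) * np.cos(t),
                          radius * np.cos(phi) * np.sin(t),
                          radius * np.sin(phi)))
    return np.asarray(verts)
```

The bird-eye view texture is then projected onto these vertices, with the 3D robot model placed at the origin.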
As shown in Fig. 5.49, the robot has four fish-eye cameras for monitoring. We
construct the arbitrary viewpoint visualization system by the following procedure.
As fish-eye camera images are distorted, we reduce these distortions as detailed
in [16, 42].
We transform the converted image into a vertical viewpoint image by perspec-
tive projection transformation. We integrate the images into a bird-eye view image.
However, the bird-eye view image has gaps between each camera. Therefore, we
5 Dual-Arm Construction Robot with Remote-Control Function 251
calibrate each camera before generating the bird-eye view image. We calculate inter-
nal parameters as in [52]. We estimate the external parameters between the local
camera coordinate system and the world coordinate system by using square boards
in the common visible region of two cameras (Fig. 5.50), and we use the square
board's four corners as reference points. Because images alone cannot recover the
absolute scale of objects, a scale must be set; as the board's size is known, we use
it as the scale.
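The mapping established by such board corners can be illustrated with a standard direct linear transform (DLT) homography estimate. This is a generic sketch of the technique, not the authors' exact calibration code, and the coordinates in the test values are made up.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate the 3x3 homography mapping src -> dst from four or
    more point pairs, using the standard DLT formulation: stack two
    linear constraints per correspondence and take the SVD null
    vector."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalize so the scale ambiguity is fixed
```

With the board's four corners as `src` (image coordinates) and their known metric positions as `dst`, the recovered homography carries the board's physical size, which fixes the scale of the bird-eye view.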
Operators need to monitor the robot’s movement. Therefore, we move the robot
model posture in accordance with the real robot posture. The robot model is set at
the center of the hemispherical model. The construction robot consists of two arms,
a two-layered body, and a track. The robot sends the angles of its two arms and a
relative angle between its upper and lower body, and its lower body and track. The
posture of the 3D robot model is determined from these transmitted angles. The
robot can rotate both arms through 360°, and we confirmed that the 3D robot model
posture stays synchronized with the real robot's posture (Figs. 5.51 and 5.52).
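Updating the 3D model posture from the transmitted angles amounts to forward kinematics. A simplified planar sketch (the link lengths and the cumulative-angle convention are assumptions for illustration):

```python
import numpy as np

def arm_joint_positions(link_lengths, joint_angles):
    """Planar forward kinematics: accumulate joint angles to get each
    link direction, and return the joint/end positions needed to draw
    the robot model's arm from the angles sent by the robot."""
    pts = [np.zeros(2)]
    heading = 0.0
    for length, angle in zip(link_lengths, joint_angles):
        heading += angle  # each joint rotates relative to the previous link
        pts.append(pts[-1] + length * np.array([np.cos(heading),
                                                np.sin(heading)]))
    return np.vstack(pts)
```

In the actual system, the same idea extends to 3D with the relative angles between the upper body, lower body, and track included in the chain.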
It is difficult to grasp the robot's direction in the arbitrary viewpoint visualization
because the viewpoint moves. Therefore, we display the heading direction in the
bird-eye view image. We measure the direction using the encoders and the IMU in
the robot; the encoders measure the relative angles between the upper body and the
lower body, and between the lower body and the track.
The blue line shows 0◦ of the upper body (Fig. 5.53), and the red line shows 90◦
of the upper body. The doll at the upper left shows the yaw angle from the IMU.
Cameras may break in disaster areas, and a broken camera increases the blind spots.
Therefore, we interpolate the missing images using the other cameras' images. Because
we integrate the fish-eye camera images, the fields of view of the cameras overlap;
we avoid duplicated texture by dividing the shared areas among the cameras, and we
integrate the images so as to minimize the gap between textures at the boundary of each
camera region (Fig. 5.54). Because the combination of usable cameras differs with the
fault pattern, the boundaries also differ with the fault pattern. When the robot uses
four cameras, there are 16 possible failure patterns (Fig. 5.55). Because interpolation
under camera malfunction needs the external parameters for all patterns, we calculate
in advance. As shown in Fig. 5.56a, when the lower left image is missing, we can
interpolate it using the proposed system (Fig. 5.56b).
Fig. 5.53 Displaying robot direction using blue and red lines
[Figure panels: (a) normal condition, (b) 1 camera is broken, (c) 2 cameras are broken]
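The 16 failure patterns for four cameras are simply all on/off combinations of the cameras. A small sketch of enumerating them, and of picking the nearest working camera for a given view direction (the mounting azimuths here are assumptions, not taken from the robot):

```python
def failure_patterns(n_cameras=4):
    """All on/off combinations of the cameras: 2**4 = 16 patterns."""
    return [tuple((mask >> i) & 1 for i in range(n_cameras))
            for mask in range(2 ** n_cameras)]

def nearest_working_camera(sector_deg, working, camera_deg=(0, 90, 180, 270)):
    """Pick the working camera whose mounting azimuth is closest to a
    given view sector, using wrap-around distance on the circle."""
    def wrap(a, b):
        d = abs(a - b) % 360
        return min(d, 360 - d)
    return min(working, key=lambda i: wrap(sector_deg, camera_deg[i]))
```

Precomputing the boundary (and the external parameters) for every pattern, as the text describes, means the system can switch immediately when a camera fails instead of recalibrating online.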
We need visual information for visual information feedback. A visible camera, i.e.,
an RGB camera, is usually used to obtain it. However, obtaining visual information
in a dark night scene and/or a very foggy scene is very challenging. For such
conditions, a long-wave infrared (LWIR) camera is very effective, especially for
detecting humans and/or animals.
Fog is opaque to visible light but transparent to LWIR. The key difference between
visible light and LWIR is the wavelength: visible light spans 400 [nm] to 700 [nm],
while the LWIR camera detects radiation of about 10 [µm] wavelength. This range
matches the intensity peak radiated by objects from −80 °C to 90 °C according to
Wien's displacement law, which is why the LWIR camera is also known as a thermal
camera. Typical fog particles are roughly 10 [µm] in size. Fog is opaque to visible
light because the visible wavelengths are shorter than the fog particles, whereas the
LWIR wavelength is comparable to the particle size. Therefore, the LWIR camera
can observe objects through fog.
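The quoted temperature range can be checked against Wien's displacement law, λ_peak = b/T with b ≈ 2898 µm·K:

```python
WIEN_B_UM_K = 2898.0  # Wien's displacement constant, in um*K

def peak_wavelength_um(temp_c):
    """Peak blackbody radiation wavelength (um) for a surface at
    temp_c degrees Celsius, via lambda_peak = b / T."""
    return WIEN_B_UM_K / (temp_c + 273.15)
```

Here `peak_wavelength_um(-80)` is about 15 µm and `peak_wavelength_um(90)` about 8 µm, bracketing the roughly 10 µm band the LWIR camera detects.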
Even in a dark night, a human body itself radiates LWIR, which means the LWIR
camera can observe humans without additional lighting. This is another advantage
of the LWIR camera.
As mentioned, the LWIR camera has good properties for foggy and/or night scenes.
However, its resolution is usually much lower than that of a visible camera, and
it cannot capture visible texture information. For these reasons, an image fusion
algorithm for visible and LWIR image pairs is strongly needed, and image fusion
requires precise image alignment. Image registration between a cross-modal image
pair such as a visible and LWIR pair is a very challenging problem. To align visible
and LWIR image pairs precisely, we have developed an accurate geometric calibration
system and a coaxial visible and LWIR camera system.
The overview of the proposed calibration flow is shown in Fig. 5.57. The proposed
calibration consists of five steps: (1) calibration of the visible camera, (2) tone
mapping of the LWIR image, (3) calibration of the LWIR camera, (4) extrinsic
parameter estimation, and (5) image alignment. In the proposed system, we first
simultaneously capture visible and LWIR images that both include the checker
pattern of the proposed calibration target, as shown in Fig. 5.57a, b. Tone mapping
is then applied to the captured LWIR image to enhance the checker pattern, and the
intrinsic parameters of the visible and LWIR cameras are estimated.
Next, the extrinsic camera parameters between the two cameras are estimated.
Finally, both images are aligned using the estimated intrinsic and extrinsic camera
parameters, as shown in Fig. 5.57c.
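The chapter does not specify the tone mapping used in step (2); one plausible minimal form is a percentile-based contrast stretch that makes a low-contrast checker pattern visible in an 8-bit image:

```python
import numpy as np

def tonemap_lwir(raw, lo_pct=1.0, hi_pct=99.0):
    """Contrast-stretch a raw (e.g. 14-bit) LWIR frame into 8 bits.
    Percentile clipping keeps a few hot/cold outlier pixels from
    compressing the range where the checker pattern lives."""
    lo, hi = np.percentile(raw, [lo_pct, hi_pct])
    out = np.clip((raw.astype(np.float64) - lo) / max(hi - lo, 1e-9), 0.0, 1.0)
    return (out * 255.0).astype(np.uint8)
```

After this step, a standard visible-light checker detector can be run on the LWIR image, which is what makes the joint calibration pipeline practical.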
The proposed two-layer calibration target is shown in Fig. 5.58a. It consists of two
layers: (1) a black planar base board and (2) white planar plates, which are held up
by poles as shown in Fig. 5.58b. The black base board is made of a resin whose
thermal infrared emissivity is very small, while the white planar plates are made of
aluminum, with their surfaces painted with a resin whose thermal emissivity is large.
An LWIR image containing the proposed calibration target is shown in Fig. 5.57b:
the region of the base board is dark due to its small emissivity, while the regions
of the planar plates are bright (i.e., high temperature) owing to the thermal radiation
from the resin on the plates.
The low thermal diffusion structure of the proposed two-layer calibration target is
also very effective at preserving a clear checker pattern for a long time (>15 min).
This persistence is critical for stably capturing the checker pattern with the LWIR
camera, and it dramatically reduces the time required to obtain the set of calibration
images. As shown in Fig. 5.57a, b, we can obtain the corresponding checker pattern
from the visible and the LWIR images simultaneously, which is necessary for the
joint calibration of the visible and LWIR cameras.
The proposed calibration target lets us obtain a clear checker pattern simultaneously
in both modalities for a long time. This strength is critical for stably extracting
corresponding points from the LWIR images, because it allows us to apply the
existing sophisticated implementations and tools developed for visible images to
the LWIR images as well.
Examples of the captured checker patterns for the existing and the proposed systems
are shown in Fig. 5.59, where the top and bottom rows show the patterns of the
existing and the proposed systems, respectively, and each column shows the time-series
variation of the pattern just after heating. As shown in Fig. 5.59, the checker pattern
of the proposed target is much clearer than that of the existing target at all times.
Furthermore, the clear checker pattern of the proposed target is preserved even after
600 [s], while that of the existing target quickly fades.
To quantitatively evaluate the stability of the extracted corresponding points, we
measured the mean square error between the point coordinates extracted at the initial
time and those extracted at each subsequent time. The time-series variation of the
error is shown in Fig. 5.60: the smaller the pixel error, the more stable the extracted
corresponding points are. As shown in Fig. 5.60, the error of the proposed target
remains smaller than 0.1 pixel even after 800 [s], while the error of the existing
target grows rapidly after 60 [s]. These results show the stability of the corresponding
points extracted with the proposed target compared with the existing one.
We captured 60 image pairs for the accuracy evaluation. Among them, we used
30 pairs to calibrate both cameras; the other 30 were used for evaluation. To evaluate
the performance, we measured the mean reprojection error (MRE), i.e., the residual
between the point coordinates extracted in each image and the point coordinates
transformed from the world coordinate system into each image using the camera
parameters estimated by the proposed and the existing systems. The MRE of the
proposed calibration system was 0.139 pixel for the visible image and 0.0676 pixel
for the LWIR image, while those of the existing calibration system were 2.360 pixel
and 0.1841 pixel, respectively.
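The MRE can be written down directly with a pinhole projection model; the camera parameters in this sketch are illustrative, not the calibrated values from the experiment:

```python
import numpy as np

def mean_reprojection_error(K, R, t, world_pts, image_pts):
    """Project world points with intrinsics K and pose (R, t), then
    average the pixel distance to the extracted image points."""
    P = K @ np.hstack([R, t.reshape(3, 1)])          # 3x4 projection matrix
    homog = np.hstack([world_pts, np.ones((len(world_pts), 1))])
    proj = (P @ homog.T).T
    proj = proj[:, :2] / proj[:, 2:3]                # perspective divide
    return float(np.mean(np.linalg.norm(proj - image_pts, axis=1)))
```

With a perfect calibration the residual is zero; a one-pixel error on one of three points yields an MRE of 1/3 pixel, which gives a feel for the sub-pixel figures reported above.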
In the previous section, we developed the software calibration system for the visible
and LWIR camera pair. Here, we discuss hardware alignment for the pair: we
developed a coaxial visible and LWIR camera system.
Figure 5.61 shows the inside of our coaxial visible and LWIR camera system. A
beam splitter made of silicon divides the light entering through an optical window
into visible and LWIR rays; silicon has high transmittance at LWIR wavelengths and
high reflectance at visible wavelengths. An optical window for a visible camera is
usually made of glass, because glass has high transmittance at visible wavelengths;
however, glass has very low transmittance at LWIR wavelengths and therefore cannot
be used for the optical window of a coaxial visible and LWIR camera system. No
suitable material with high transmittance at both visible and LWIR wavelengths is
known. Nevertheless, we need to protect the cameras and the dust-sensitive beam
splitter from rainwater and dust, especially for outdoor use; the optical window is
therefore very important for improving the dust resistance of the camera system.
For the proposed camera system, we used plastic food wrap, a thin polyethylene
film, as shown in Fig. 5.62. The LWIR camera of our system observes wavelengths
around 10 [µm]. Considering this, we empirically found that a plastic wrap whose
thickness is close to the LWIR wavelength, i.e., about 10 [µm], is suitable for the
optical window of the visible and LWIR camera system. This simple idea greatly
improves the dust resistance of the camera system.
The mechanical accuracy of the hardware alignment is limited, so the software
calibration of the previous section is also incorporated. Figure 5.63 shows example
images from the developed coaxial visible and LWIR camera system: (a) the visible
image, (b) the LWIR image, and (c), (d) alpha-blending results of the visible and
LWIR images without and with the calibration, respectively. One can observe
misalignment in Fig. 5.63c, which shows the limitation of the mechanical alignment.
After applying the calibration of the previous section, we obtain the alpha-blending
result shown in Fig. 5.63d, which demonstrates the high accuracy of the image
alignment.
Fig. 5.63 Visible image, LWIR image, and alpha blending results with/without the calibration
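The alpha blending used in Fig. 5.63c, d is a per-pixel weighted average of the two aligned images; a minimal sketch (the alpha value is arbitrary):

```python
import numpy as np

def alpha_blend(visible, lwir, alpha=0.5):
    """Per-pixel weighted average of two aligned images (float arrays
    on the same intensity scale); alpha weights the visible image."""
    visible = np.asarray(visible, dtype=float)
    lwir = np.asarray(lwir, dtype=float)
    return alpha * visible + (1.0 - alpha) * lwir
```

Because the blend is per-pixel, any residual misalignment shows up immediately as ghosting, which is why it is a convenient visual check of the calibration quality.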
We also built an evaluation system to simulate a foggy environment. A diorama is
covered with an acrylic case as shown in Fig. 5.64, and a fog machine and a fan are
installed to generate and diffuse the fog.
Fig. 5.65 Visible and LWIR images for clear and foggy scenes
Figure 5.65 shows the images captured by the developed coaxial visible and LWIR
camera system for clear and foggy scenes. These results demonstrate that we can
observe the environment through the fog.
References
1. Araki, R., Okada, T., Tazaki, Y., Yokokohji, Y., Yoshinada, H., Nakamura, S., Kurashiki, K.:
External force estimation of a hydraulically-driven robot in the disaster area for high fidelity
teleoperation. In: 2018 JSME Conference on Robotics and Mechatronics (ROBOMECH 2018),
2A1-J05 (2018). (In Japanese)
2. Bensmaïa, S., Hollins, M.: Pacinian representations of fine surface texture. Percept. Psy-
chophys. 67(5), 842–854 (2005)
3. Bolanowski Jr., S.J., Gescheider, G.A., Verrillo, R.T., Checkosky, C.M.: Four channels mediate
the mechanical aspects of touch. J. Acoust. Soc. Am. 84(5), 1680–1694 (1988)
4. Bry, A., Bachrach, A., Roy, N.: State estimation for aggressive flight in GPS-denied envi-
ronments using onboard sensing. In: 2012 IEEE International Conference on Robotics and
Automation, pp. 1–8 (2012)
5. Chayama, K., Fujioka, A., Kawashima, K., Yamamoto, H., Nitta, Y., Ueki, C., Yamashita,
A., Asama, H.: Technology of unmanned construction system in Japan. J. Robot. Mechatron.
26(4), 403–417 (2014)
6. Choi, S.Y., Choi, B.H., Jeong, S.Y., Gu, B.W., Yoo, S.J., Rim, C.T.: Tethered aerial robots
using contactless power systems for extended mission time and range. In: 2014 IEEE Energy
Conversion Congress and Exposition (ECCE), pp. 912–916 (2014)
7. Culbertson, H., Unwin, J., Kuchenbecker, K.J.: Modeling and rendering realistic textures from
unconstrained tool-surface interactions. IEEE Trans. Haptics 7(3), 381–393 (2014)
8. CyPhy: The future of high-powered commercial drones. https://www.cyphyworks.com/
products/parc/. Accessed 8 Aug 2018
9. del Sol, E., et al.: External force estimation for teleoperation based on proprioceptive sensors.
Int. J. Adv. Robot. Syst. 11(52) (2014)
10. DJI: Phantom4. https://www.dji.com/jp/phantom-4. Accessed 8 Aug 2018
11. Egawa, E., Kawamura, K., Ikuta, M., Eguchi, T.: Use of construction machinery in earthquake
recovery work. Hitachi Rev. 62(2), 136–141 (2013)
12. He, S., Ye, J., Li, Z., Li, S., Wu, G., Wu, H.: A momentum-based collision detection algorithm
for industrial robots. In: 2015 IEEE International Conference on Robotics and Biomimetics
(ROBIO), pp. 1253–1259 (2015)
13. Heuer, H., Owens, A.: Vertical gaze direction and the resting posture of the eyes. Perception
18(3), 363–377 (1989)
14. Higashi, K., Okamoto, S., Yamada, Y.: What is the hardness perceived by tapping? In: Interna-
tional Conference on Human Haptic Sensing and Touch Enabled Computer Applications, pp.
3–12. Springer, Berlin (2016)
15. Hirabayashi, T., et al.: Teleoperation of construction machines with haptic information for
underwater application. Autom. Constr. 15(5), 563–570 (2006)
16. Hughes, C., Denny, P., Glavin, M., Jones, E.: Equidistant fish-eye calibration and rectification
by vanishing point extraction. IEEE Trans. Pattern Anal. Mach. Intell. 32(12), 2289–2296
(2010)
17. Ide, T., Nabae, H., Hirota, Y., Yamamoto, A., Suzumori, K.: Preliminary test results of hydraulic
tough multi finger robot hand. In: 2018 JSME Conference on Robotics and Mechatronics
(ROBOMECH 2018), 2P1-L01 (2018). (In Japanese)
18. Iwataki, S., Fujii, H., Moro, A., Yamashita, A., Asama, H., Yoshinada, H.: Visualization of
the surrounding environment and operational part in a 3DCG model for the teleoperation of
construction machines. In: 2015 IEEE/SICE International Symposium on System Integration
(SII), pp. 81–87 (2015)
19. James, C.A., Bednarz, T.P., Haustein, K., Alem, L., Caris, C., Castleden, A.: Tele-operation of a
mobile mining robot using a panoramic display: an exploration of operators sense of presence.
In: 2011 IEEE International Conference on Automation Science and Engineering, pp. 279–284
(2011)
20. Kiribayashi, S., Yakushigawa, K., Nagatani, K.: Design and development of tether-powered
multirotor micro unmanned aerial vehicle system for remote-controlled construction machine.
In: Preprints of the 11th International Conference on Field and Service Robotics, p. # 24 (2017)
21. Kiribayashi, S., Yakushigawa, K., Nagatani, K.: Design and development of tether-powered
multirotor micro unmanned aerial vehicle system for remote-controlled construction machine.
In: Proceedings of Field and Service Robotics, pp. 637–648 (2018)
22. Kiribayashi, S., Yakushigawa, K., Nagatani, K.: Position estimation of tethered micro
unmanned aerial vehicle by observing the slack tether. In: 2017 IEEE International Symposium
on Safety, Security and Rescue Robotics (SSRR), pp. 159–165 (2017)
23. Komatsu, R., Fujii, H., Kono, H., Tamura, Y., Yamashita, A., Asama, H.: Bird’s-eye view
image generation with camera malfunction in irradiation environment. In: 6th International
Conference on Advanced Mechatronics (ICAM 2015), pp. 177–178 (2015)
24. Kondo, D., Nakamura, S., Kurashiki, K., Yoshinada, H.: Immersive display for remote control
of construction robot. In: Proceedings of the 62nd Annual Conference of System, Control and
Information Engineers (ISCIE), pp. 131–135 (2018). (In Japanese)
25. Kontz, M.E., et al.: Pressure based exogenous force estimation. In: 2006 ASME International
Mechanical Engineering Congress and Exposition, pp. 111–120 (2006)
26. Kuchenbecker, K.J., Fiene, J., Niemeyer, G.: Improving contact realism through event-based
haptic feedback. IEEE Trans. Vis. Comput. Graph. 12(2), 219–230 (2006)
27. Lamore, P., Muijser, H., Keemink, C.: Envelope detection of amplitude-modulated high-
frequency sinusoidal signals by skin mechanoreceptors. J. Acoust. Soc. Am. 79(4), 1082–1085
(1986)
28. Luh, J., Zheng, Y.F.: Computation of input generalized forces for robots with closed kinematic
chain mechanisms. IEEE J. Robot. Autom. 1(2), 95–103 (1985)
29. Makino, Y., Maeno, T., Shinoda, H.: Perceptual characteristic of multi-spectral vibrations
beyond the human perceivable frequency range. In: 2011 IEEE World Haptics Conference
(WHC), pp. 439–443. IEEE (2011)
30. McMahan, W., Kuchenbecker, K.J.: Spectral subtraction of robot motion noise for improved
event detection in tactile acceleration signals. In: International Conference on Human Haptic
Sensing and Touch Enabled Computer Applications, pp. 326–337. Springer (2012)
31. McMahan, W., Gewirtz, J., Standish, D., Martin, P., Kunkel, J.A., Lilavois, M., Wedmid, A.,
Lee, D.I., Kuchenbecker, K.J.: Tool contact acceleration feedback for telerobotic surgery. IEEE
Trans. Haptics 4(3), 210–220 (2011)
32. Minamoto, M., Nakayama, K., Aokage, H., Sako, S.: Development of a Tele-earthwork system.
Autom. Robot. Constr. XI 269–275 (1994)
33. Moteki, M., Nishiyama, A., Yuta, S., Ando, H., Ito, S., Fujino, K.: Work efficiency evaluation on
the various remote control of the unmanned construction. In: 15th Symposium on Construction
Robotics in Japan, pp. O–21 (2015). (In Japanese)
34. Nagano, H., Takenouchi, H., Konyo, M., Tadokoro, S.: Haptic transmission using high-
frequency vibration generated on body of construction robot -performance evaluation of haptic
transmission system under construction robot teleoperation-. In: 2018 JSME Conference on
Robotics and Mechatronics (ROBOMECH 2018), 2A1-J04 (2018). (In Japanese)
35. Ogino, Y., Shibata, T., Tanaka, M., Okutomi, M.: Coaxial visible and FIR camera system with
accurate geometric calibration. In: SPIE Defense + Commercial Sensing (DCS 2017), vol.
10214, pp. 1021415–1–6 (2017)
36. Okamoto, S., Yamada, Y.: An objective index that substitutes for subjective quality of vibrotac-
tile material-like textures. In: 2011 IEEE/RSJ International Conference on Intelligent Robots
and Systems (IROS), pp. 3060–3067. IEEE (2011)
37. Okamura, A.M., Dennerlein, J.T., Howe, R.D.: Vibration feedback models for virtual envi-
ronments. In: Proceedings, 1998 IEEE International Conference on Robotics and Automation,
vol. 1, pp. 674–679. IEEE (1998)
38. Okura, F., Ueda, Y., Sato, T., Yokoya, N.: Teleoperation of mobile robots by generating aug-
mented free-viewpoint images. In: 2013 IEEE/RSJ International Conference on Intelligent
Robots and Systems, pp. 665–671 (2013)
39. Padgaonkar, A.J., et al.: Measurement of angular acceleration of a rigid body using linear
accelerometers. ASME J. Appl. Mech. 42(3), 552–556 (1975)
40. Papachristos, C., Tzes, A.: The power-tethered UAV-UGV team: a collaborative strategy for
navigation in partially-mapped environments. In: 22nd Mediterranean Conference on Control
and Automation, pp. 1153–1158 (2014)
41. Sato, T., Moro, A., Sugahara, A., Tasaki, T., Yamashita, A., Asama, H.: Spatio-temporal bird’s-
eye view images using multiple fish-eye cameras. In: Proceedings of the 2013 IEEE/SICE
International Symposium on System Integration, pp. 753–758 (2013)
42. Scaramuzza, D., Martinelli, A., Siegwart, R.: A toolbox for easily calibrating omnidirectional
cameras. In: 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp.
5695–5701 (2006)
43. Shen, S., Michael, N., Kumar, V.: Autonomous multi-floor indoor navigation with a computa-
tionally constrained MAV. In: 2011 IEEE International Conference on Robotics and Automa-
tion, pp. 20–25 (2011)
44. Shibata, T., Tanaka, M., Okutomi, M.: Accurate joint geometric camera calibration of visible
and far-infrared cameras. In: IS&T International Symposium on Electronic Imaging (EI 2017)
(2017)
45. Weiss, S., Scaramuzza, D., Siegwart, R.: Monocular-SLAM-based navigation for autonomous
micro helicopters in GPS-denied environments. J. Field Robot. 28(6), 854–874 (2011)
46. Sun, W., Iwataki, S., Komatsu, R., Fujii, H., Yamashita, A., Asama, H.: Simultaneous tele-
visualization of construction machine and environment using body mounted cameras. In: 2016
IEEE International Conference on Robotics and Biomimetics (ROBIO 2016), pp. 382–387
(2016). https://doi.org/10.1109/ROBIO.2016.7866352
47. Tachi, S.: Telexistence. J. Robot. Soc. Jpn. 33(4), 215–221 (2015). (In Japanese)
48. Tian, Y., Chen, Z., Jia, T., Wang, A., Li, L.: Sensorless collision detection and contact force
estimation for collaborative robots based on torque observer. In: 2016 IEEE International
Conference on Robotics and Biomimetics (ROBIO), pp. 946–951 (2016)
49. Yamaguchi, T., et al.: A survey on the man-machine interface in remote operation. In: 59th
Annual Conference of Japan Society of Civil Engineers, vol. 59, pp. 373–374 (2004)
50. Yamauchi, T., Okamoto, S., Konyo, M., Hidaka, Y., Maeno, T., Tadokoro, S.: Real-time remote
transmission of multiple tactile properties through master-slave robot system. In: 2010 IEEE
International Conference on Robotics and Automation (ICRA), pp. 1753–1760. IEEE (2010)
51. Yoshikawa, T.: Foundation of Robotics. MIT Press, Cambridge (1990)
52. Zhang, Z.: Flexible camera calibration by viewing a plane from unknown orientations. In: 7th
International Conference on Computer Vision (ICCV 1999), vol. 1, pp. 666–673 (1999)
Part III
Preparedness for Disaster
Chapter 6
Development of Tough Snake Robot
Systems
Abstract In the Tough Snake Robot Systems Group, a snake robot without wheels
(nonwheeled-type snake robot) and a snake robot with active wheels (wheeled snake
robot) have been developed. The main target applications of these snake robots are
exploration of complex plant structures, such as the interior and exterior of pipes,
debris, and even ladders, and the inspection of narrow spaces within buildings, e.g.,
roof spaces and underfloor spaces, which would enable plant patrol and inspection.
At the head of each robot, a compact and lightweight gripper is mounted to allow
the robot to grasp various types of objects, including fragile objects. To measure
the contact force of each robot, a whole-body tactile sensor has been developed.
A sound-based online localization method for use with the in-pipe snake robot has
also been developed. To enable teleoperation of platform robots with the sensing
system and the gripper, a human interface has also been developed. The results of
some experimental demonstrations of the developed tough snake robot systems are
presented.
Robotic systems are strongly expected to be used for information gathering and to
contribute to the response to frequent natural and man-made disasters. In Japan,
the market for disaster response robots is very limited, so consideration of dual
emergency and daily use applications is one solution that may help to accelerate
the development of disaster response robots. In the ImPACT Tough Robot Challenge
(TRC) project, we consider the following three innovation areas. In terms of technical
innovation, we aim to create tough fundamental technologies that are effective for
use in extreme disaster situations. In social innovation terms, we aim to contribute to
preparedness, response and recovery capabilities. For industrial innovation, we aim
to propagate the technology to provide innovation in industrial fields. In the ImPACT-
TRC project, we have developed tough snake robot systems that can be used not only
in emergency situations such as disaster scenarios but are also suitable for daily-use
applications such as plant inspection and maintenance.
A snake robot is expected to be able to perform a wide variety of tasks while
having a simple structure, and the design and control methods used for snake robots
have been studied extensively. There are two possible approaches to the study of
snake robots. The first is the constructive approach that aims to understand the prin-
ciples of the nature of a living snake [1, 16], e.g., to solve the question of “How can
a living snake move without any legs?”. The second is the bio-inspired approach, in
K. Itoyama
Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo
152-8552, Japan
e-mail: itoyama@ra.sc.e.titech.ac.jp
H. G. Okuno
Waseda University, 3F, 2-4-12 Okubo, Shinjuku-ku, Tokyo 169-0072, Japan
e-mail: okuno@nue.org
Y. Bando
National Institute of Advanced Industrial Science and Technology (AIST),
2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
e-mail: y.bando@aist.go.jp
S. Tadokoro
Tohoku University, 6-6-01 Aramaki-Aza-Aoba, Aoba-ku, Sendai 980-8579, Japan
e-mail: tadokoro@rm.is.tohoku.ac.jp
which the aim is to create an artificial snake robot to perform specific tasks and not
to imitate a living snake. For example, this approach can be used to realize a solution
for victim searching under the debris of collapsed buildings in disaster scenarios.
Snake robots have been developed by many researchers, including [22, 41, 57], but
the snakes in these studies do not have dustproofing and waterproofing functions and
are not applicable to real search missions. In the development of the tough snake
robot systems, we consider the following innovations. From a technical innovation
perspective, we aim to develop tough snake robot platforms that are able to reach
places where legged robots and thin serpentine robots are unable to enter, that have
sufficient robustness and adaptability to survive in unknown environments, and that
have tough control systems with fault tolerance, failure prediction and failure recov-
ery functions. From the social innovation viewpoint, we aim to contribute not only
to daily inspection and maintenance of plant infrastructures but also to information
gathering processes for disaster preparedness, response and recovery efforts to real-
ize a safe and secure society. From the industrial innovation perspective, we aim to
contribute to reducing the workloads of operators in inspection and maintenance of
plant infrastructures by applying the tough snake robots to these tasks while also
creating a new business based on inspection/maintenance robots.
In the Tough Snake Robot Systems Group, we have developed two types of
snake robots for platform applications. The first is a snake robot without wheels
(nonwheeled-type snake robot) developed by the Kamegawa group from Okayama
University and the Matsuno group from Kyoto University (see Sect. 6.2). This robot
is both waterproof and dustproof. The main target application of the nonwheeled-
type snake robot is the exploration of complex plant structures, including the interior
and exterior of pipes, debris and even ladders. The second is a snake robot that
has active wheels (wheeled snake robot) developed by the Tanaka group from The
University of Electro-Communications (see Sect. 6.3). One of the wheeled snake
robot’s intended applications is the inspection of narrow spaces in buildings, e.g.,
the roof space and underfloor spaces. Another intended application is plant patrol
and inspection. Because these snake robots are expected to be applicable not only
to inspection tasks but also to maintenance tasks for daily use and to light work for
rescue and recovery missions in disaster scenarios, a jamming layered membrane
gripper mechanism that allows the robot to grasp differently shaped objects without
applying excessive pushing force was developed by the Tadakuma group of Tohoku
University (see Sect. 6.4). The gripper is mounted at the tip of the snake robot and
can accomplish specific tasks, including opening and closing a broken valve and
picking up fragile objects.
Snake robots have long and thin shapes and have multiple degrees of freedom.
Information about contact point locations, the forces at contact points, and the dis-
tance between the robot’s body and the environment is very important for robot
control. To enable full body sensing, a whole-body tactile sensor has been developed
by the Suzuki group of Kanazawa University (see Sect. 6.5).
For pipe inspection applications, the in-pipe robots need an online localization
function for autonomous inspection. A sound-based online localization method for
an in-pipe snake robot with an inertial measurement unit (IMU) has been devel-
270 F. Matsuno et al.
oped by the Okuno group from Waseda University, the Itoyama group from Tokyo
Institute of Technology, and the Bando group from National Institute of Advanced
Industrial Science and Technology (AIST). The absolute distance between the robot
and the entrance of a pipeline can be estimated by measuring the time of flight (ToF)
of a sound emitted by a loudspeaker located at the pipe entrance. The developed
method estimates the robot’s location and generates a pipeline map by combining
the ToF information with the orientation estimated using the IMU (see Sect. 6.6). To
enable teleoperation of the platform robots with the sensing system and the gripper, a
human interface has been developed by the Matsuno group from Kyoto University to
maintain situational awareness and to reduce the load on the operator (see Sect. 6.7).
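The ToF-based localization idea described above can be sketched numerically. The following Python fragment is illustrative only, not the chapter's implementation: the function names, the 2-D simplification, and the 343 m/s sound speed are our assumptions, and the actual system additionally fuses this information into pipeline-mapping software.

```python
import numpy as np

def distance_from_tof(tof_s, speed_of_sound=343.0):
    """Distance along the pipe from the entrance loudspeaker, from the
    one-way acoustic time of flight (sound speed is an assumption)."""
    return speed_of_sound * tof_s

def map_pipe(tofs, headings):
    """Toy 2-D pipeline-map builder: successive ToF distances give step
    lengths, IMU headings give step directions."""
    pts = [np.zeros(2)]
    dists = [distance_from_tof(t) for t in tofs]
    for step, th in zip(np.diff([0.0] + dists), headings):
        pts.append(pts[-1] + step * np.array([np.cos(th), np.sin(th)]))
    return np.array(pts)

# Robot 1 m into a horizontal run, then 2 m further after a 90-degree elbow:
path = map_pipe(tofs=[1.0 / 343.0, 3.0 / 343.0], headings=[0.0, np.pi / 2])
```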
The allocation of the roles in the development of the tough snake robot systems
is shown in Fig. 6.1. The fundamental technologies, including the control strategies
used for the nonwheeled-type snake robot and the wheeled snake robot, the jam-
ming layered membrane gripper mechanism, the whole-body tactile sensor, and the
sound-based online localization method for the in-pipe snake robot, are all original.
We designed the entire human interface system by taking the specifications of each
of these fundamental technologies into consideration. The novelties of the developed
tough snake robot systems include the robot system integration and the applicability
of these robots to real missions, including inspection and search and rescue opera-
tions, which require the robot to have both high mobility and a manipulation function
to perform light work such as opening or closing a broken valve and picking up small
fragile objects.
6 Development of Tough Snake Robot Systems 271
Until now, various types of snake robots have been researched and developed [16,
30]. A nonwheeled-type snake robot has a simple configuration in which the links are
connected serially using joints, and there are no wheels on this robot. We also
developed several nonwheeled-type snake robots. Our nonwheeled-type snake robots
developed in ImPACT-TRC are shown in Fig. 6.2. Because the link lengths can be
shortened by omitting the wheels from the configuration, a more detailed robot shape
can be realized and high adaptability to different environmental shapes is expected.
In addition, this robot has the advantage that waterproofing and dustproofing can be
achieved easily when compared with the case of a snake robot with wheel mechanisms
by simply attaching a cover. However, because this robot does not have an efficient
wheel-based propulsion mechanism, it is necessary to control the form of the entire
body to move the robot. The main target application of the nonwheeled-type snake
robot is exploration of complex plant structures, such as the interior and exterior of
pipes, debris fields and ladders. In this section, we describe the control method used
for the nonwheeled-type snake robot and the demonstration experiments performed
using the snake robot that we have developed.
(1) Shape Fitting Using the Backbone Curve: We use a snake robot model that is
composed of alternately connected pitch-axis and yaw-axis joints. All links have the
same length l, and the number of joints is n_joint.
In our study, a method to approximate the discrete snake robot to a continuous
spatial curve, called the backbone curve, is used. This method makes it possible to
consider a snake robot in an abstract manner as a continuous curve, which means
that it then becomes easy to design a complex form. Yamada et al. modeled the form
of the snake robot using the Frenet–Serret formulas [60], and proposed a method
to derive suitable joint angles based on the curvature and the torsion of the target
curve [61]. We used Yamada’s method [61] here to calculate the joint angles required
to approximate the snake robot to a target form.
In Fig. 6.3, e₁(s), e₂(s), and e₃(s) are the unit vectors that are used to configure the
Frenet–Serret frame, which depends on the form of the curve, and s is the arc-length
variable along the curve. In addition, to model the snake robot, it is also necessary to
consider the joint directions. As shown in Fig. 6.3, the backbone curve reference set
composed of e_r(s), e_p(s), and e_y(s) is defined based on the orientation of each part
of the robot, which can be regarded as a continuous curve. The twist angle between
e₂(s) and e_p(s) around e₁(s) is denoted by ψ(s), which can be obtained as follows:

ψ(s) = ∫₀ˢ τ(ŝ) dŝ + ψ(0),  (6.1)

where ψ(0) is an arbitrary value that corresponds to the initial angle. By varying ψ(0),
the entire backbone curve reference set rotates around the curve, and a rolling motion
is thus generated. Here, we let κ(s) and τ(s) be the curvature and the torsion in the
Frenet–Serret formulas, respectively, while κ_p(s) and κ_y(s) are the curvatures around
the pitch axis and the yaw axis in the backbone curve reference set, respectively. These
quantities are obtained as follows:
Finally, the target angle for each joint is calculated as shown below:

θᵢ = ∫ from s_h+(i−1)l to s_h+(i+1)l of κ_p(s) ds  (i : odd),
θᵢ = ∫ from s_h+(i−1)l to s_h+(i+1)l of κ_y(s) ds  (i : even),  (6.3)

where s_h is the position of the snake robot's head on the curve. Changing s_h smoothly
allows the robot to change its form along the target curve.
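Equation (6.3) is straightforward to evaluate numerically. The sketch below is illustrative rather than the chapter's actual software: the function names and the trapezoidal quadrature are our assumptions. It converts pitch- and yaw-curvature profiles into joint angles:

```python
import numpy as np

def _integrate(f, lo, hi, n=201):
    """Trapezoidal quadrature of f over [lo, hi]."""
    s = np.linspace(lo, hi, n)
    y = f(s)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(s)) / 2.0)

def joint_angles(kappa_p, kappa_y, s_h, l, n_joint):
    """Eq. (6.3): joint i integrates the pitch curvature (odd i) or the yaw
    curvature (even i) over its two adjacent links,
    [s_h + (i - 1) l, s_h + (i + 1) l]."""
    return [
        _integrate(kappa_p if i % 2 == 1 else kappa_y,
                   s_h + (i - 1) * l, s_h + (i + 1) * l)
        for i in range(1, n_joint + 1)
    ]

# A planar arc of constant curvature 0.5 bent only about the pitch axes:
theta = joint_angles(kappa_p=lambda s: np.full_like(s, 0.5),
                     kappa_y=lambda s: np.zeros_like(s),
                     s_h=0.0, l=0.1, n_joint=4)
# Odd (pitch) joints integrate 0.5 over a length of 2l = 0.2, i.e., 0.1 rad;
# even (yaw) joints stay at zero.
```

Increasing s_h smoothly between calls reproduces the shift control described in the text: the joint-angle pattern propagates along the body while the spatial form of the curve stays fixed.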
(2) Backbone Curve Connecting Simple Shapes [47]: It is difficult to provide
an analytical representation of the complex target form of a snake robot. There is
also another problem in that the torsion sometimes becomes infinite if there is a zero
curvature region in the curve [60]. To solve these problems, we proposed a method for
target form representation based on connection of simple shapes. Using this method,
we can design the target form intuitively because the geometric properties of simple
shapes are clear. In addition, the corresponding joint angle can be obtained easily
because the curvature and the torsion are known. The shapes that are connected using
this method are called segments. We will explain here how the target form can be
represented by connecting simple shapes.
The approximation method can be applied with no problems for the internal parts
of the segments. Therefore, the curvature of the target form κ(s) and the torsion τ (s)
can be obtained as follows:
where κ_j and τ_j are the curvature and the torsion of segment j, respectively. However,
because the Frenet–Serret frame is discontinuous at the connection part, i.e., where
the segments are connected, it is thus necessary to devise a suitable representation.
Let ψ̂_j be the twist angle at the connection part that connects segments j and (j + 1),
where this angle is one of the design parameters. To consider this twist angle in the
calculation of the approximation method, (6.1) must be replaced by

ψ(s) = ∫₀ˢ τ(ŝ) dŝ + ψ(0) + Σ_j ψ̂_j u(s − s_j),  (6.6)

where u(s) is the step function, which has values of 0 if s < 0 and 1 if s ≥ 0. The
snake robot's joint angles can therefore be obtained using (6.2)–(6.6). To design a
specific target form, we must decide the required shape of each segment and the twist
angle ψ̂_j. By varying ψ̂_j, we can change the target form as if the snake robot has a
virtual roll axis joint at the connection part of the target form.
Any shape can be used as long as the curvature and the torsion are known. A
straight line, a circular arc, and a helix are the simplest shapes with constant curvature
and torsion.
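For segments with piecewise-constant torsion (straight lines, arcs, helices), Eq. (6.6) reduces to accumulating each segment's torsion over the arc length traversed and adding the design twist ψ̂_j at each connection. A minimal Python sketch of this accumulation (function and parameter names are illustrative, not the chapter's software):

```python
import numpy as np

def twist_angle(s, segments, psi_hat, psi0=0.0):
    """Evaluate Eq. (6.6) for a target form built from simple segments.
    `segments` is a list of (length, tau) pairs with constant torsion per
    segment; `psi_hat[j]` is the design twist added where segment j meets
    segment j + 1 (the step u(s - s_j), with u(0) = 1 as in the text)."""
    psi = psi0
    s_end = 0.0
    for j, (length, tau) in enumerate(segments):
        ds = min(s, s_end + length) - s_end   # arc length inside segment j
        if ds <= 0:
            break
        psi += tau * ds                        # integral of constant torsion
        s_end += length
        if s >= s_end and j < len(psi_hat):
            psi += psi_hat[j]                  # twist at the connection part
    return psi

# A straight line (tau = 0) joined to a helix segment of constant torsion
# 0.2, with a 90-degree design twist at the connection:
segs = [(1.0, 0.0), (2.0, 0.2)]
psi_total = twist_angle(3.0, segs, psi_hat=[np.pi / 2])  # pi/2 + 0.2 * 2
```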
(3) Gait Design Based on Connection of Simple Shapes: We designed three novel
gaits for a snake robot using the proposed gait design method by connecting simple
shapes. The forms of these three gaits are shown in Fig. 6.4. The shapes of all segments
in each gait can be determined intuitively by determining several gait parameters.
The outlines of each of the gaits are described below.
Gait for moving over a flange on pipe [47]
Using this gait, the snake robot can climb over a flange on a pipe, even in the case
of a vertical pipe. The target form is shown in Fig. 6.4a. It is also possible for a
snake robot to propel itself along a pipe using a helical rolling process. In this
gait, the bridge part is provided in the middle of a helix to allow the snake robot
to stride over a flange. By combining the shift control with the rolling motion, it
is possible for the snake robot to move over a flange.
274 F. Matsuno et al.
Fig. 6.4 Gaits for a snake robot that were designed by connecting simple shapes
(4) Gait Design Based on Helical Rolling Motion: Snake robots that imitate bio-
logical snakes have been developed. However, snake robots can also achieve a helical
rolling motion that is not usually observed in biological snakes. We have focused
on the helical rolling motion for the movement of the snake robot [2, 23, 40]. This
motion can be expected to be effective in cylindrical environments because the snake
robot forms a spiral shape during helical rolling. For example, it can be expected to
adapt flexibly to more complex environments by using the redundancy in the snake
robot's degrees of freedom, unlike conventional and simple wheeled-type pipe
inspection robots. In this section, the helical rolling motion and its improvement
into a helical wave propagation motion are described.
The helical curve can be expressed in the Cartesian coordinate system as follows:

x(t) = a cos(t),  y(t) = a sin(t),  z(t) = bt,  (6.7)

where a denotes the helical radius, b is the increasing ratio of the helical shape in the
z direction, and t is a parameter. An example of a helical shape that is generated using
Eq. (6.7) is shown in Fig. 6.5. After the helical shape is designed in the Cartesian
coordinate system, the target shape is then converted to a curve in the Frenet–Serret
equation. The curvature and the torsion of this curve are then obtained from the
geometrical conditions given by the following equations:
κ(t) = √[(ẏz̈ − żÿ)² + (żẍ − ẋz̈)² + (ẋÿ − ẏẍ)²] / (ẋ² + ẏ² + ż²)^(3/2),  (6.8)

τ(t) = [x⁽³⁾(ẏz̈ − żÿ) + y⁽³⁾(żẍ − ẋz̈) + z⁽³⁾(ẋÿ − ẏẍ)] / [(ẏz̈ − żÿ)² + (żẍ − ẋz̈)² + (ẋÿ − ẏẍ)²],  (6.9)

where ẋ, ẍ, and x⁽³⁾ denote first-, second-, and third-order differentiation with respect
to t, respectively. By varying the values of a and b in Eq. (6.7), the radius and pitch
of the helical shape can be designed as required. Furthermore, changing the value of
ψ(0) causes the continuum robot’s coordinate system to be rolled around er (s) such
that the robot generates a lateral rolling motion.
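Equations (6.8) and (6.9) can be checked numerically against the helix of Eq. (6.7), for which the curvature and torsion are analytically κ = a/(a² + b²) and τ = b/(a² + b²). The following Python sketch approximates the derivatives with central differences (the step size h and the function names are our assumptions, not from the chapter):

```python
import numpy as np

def curvature_torsion(r, t, h=1e-3):
    """Evaluate Eqs. (6.8) and (6.9) numerically for a parametric curve
    r(t) -> np.array([x, y, z]), using central differences for the first,
    second, and third derivatives."""
    d1 = (r(t + h) - r(t - h)) / (2 * h)
    d2 = (r(t + h) - 2 * r(t) + r(t - h)) / h**2
    d3 = (r(t + 2 * h) - 2 * r(t + h) + 2 * r(t - h) - r(t - 2 * h)) / (2 * h**3)
    cross = np.cross(d1, d2)                    # components (y'z'' - z'y'', ...)
    kappa = np.linalg.norm(cross) / np.linalg.norm(d1) ** 3   # Eq. (6.8)
    tau = np.dot(cross, d3) / np.dot(cross, cross)            # Eq. (6.9)
    return kappa, tau

# Helix of Eq. (6.7) with a = 2, b = 0.5; analytically,
# kappa = a / (a^2 + b^2) and tau = b / (a^2 + b^2).
a, b = 2.0, 0.5
kappa, tau = curvature_torsion(lambda t: np.array([a * np.cos(t),
                                                   a * np.sin(t),
                                                   b * t]), t=1.0)
```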
The helical rolling motion is suitable when the snake robot moves inside a pipe.
However, when the robot is moving on the exterior of the pipe, this motion cannot
always be used to pass across a branch point on the pipe, because the helical rolling
motion only produces motion in the binormal direction of the snake robot's body.
To address this issue, we consider making the snake robot move in a tangential
direction along its body rather than use the movement in the
binormal direction produced by the helical rolling motion. We call this new motion
helical wave propagation motion.
First, it is assumed that a snake robot is in a state where it is wrapped around
a pipe. Therefore, the snake robot creates a longitudinal wave to make parts of its
trunk float away from the pipe. These floating parts are transmitted from the tail to
the head of the snake robot via the shifting method [40]. In this study, the helical
wave curve can be expressed in Cartesian coordinates as follows:
x(t) = a(t) cos(t),  y(t) = a(t) sin(t),  z(t) = b(t),  (6.10)

where

a(t) = ρ(t) + r,  b(t) = (n/2π) t,  (6.11)

ρ(t) = A sech(ωt − φ),  {A ∈ ℝ | A > 0}, {ω ∈ ℝ | ω > 0}, φ ∈ ℝ,  (6.12)
where r is the radius of the helical curve, n is the pitch in the z-axis direction of the
helical curve, and t is a parameter. It is very important to determine an appropriate
a(t) to add an appropriate wave to a conventional helical curve. We designed a(t) as
a hyperbolic secant function added to the radius r. In Eq. (6.12), A is the amplitude,
ω is the curve width, and φ is the initial phase. We can design the hyperbolic function
by varying these parameters.
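The helical wave curve of Eqs. (6.10)–(6.12) can be generated directly. The sketch below assumes the reading b(t) = (n/2π)t, so that n is the z-advance per turn; the function name and the numerical parameter values are illustrative, not taken from the chapter:

```python
import numpy as np

def helical_wave_curve(r, n, A, omega, phi, t):
    """Helical wave curve of Eqs. (6.10)-(6.12): a helix of radius r whose
    radius is locally bulged by a sech wave so that part of the trunk
    floats away from the pipe. b(t) = (n / (2*pi)) * t is our reading of
    Eq. (6.11)."""
    rho = A / np.cosh(omega * t - phi)       # Eq. (6.12): A sech(omega t - phi)
    a = rho + r                               # Eq. (6.11)
    b = n * t / (2 * np.pi)
    return np.stack([a * np.cos(t),           # Eq. (6.10)
                     a * np.sin(t),
                     b], axis=-1)

# One bulge centered at t = phi / omega on a helix wrapped around a pipe
# (r roughly the pipe radius plus body radius; values illustrative):
t = np.linspace(0.0, 6 * np.pi, 500)
pts = helical_wave_curve(r=0.055, n=0.12, A=0.02, omega=2.0, phi=6.0, t=t)
```

Shifting φ over time moves the bulge along the body, which corresponds to the shifting method [40] by which the floating part is transmitted from tail to head.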
(1) Smooth-Type Snake Robot: An overall view of the smooth-type snake robot is
shown in Fig. 6.6, and the system configuration and the details of the modules are
shown in Fig. 6.7. Table 6.1 lists the robot’s parameters. All joints have a range of
motion of 180 deg. The Dynamixel XH430-V350-R (ROBOTIS Co., Ltd.) is used
as the joint actuator. It is possible to attach a sponge rubber cover over the robot, as
shown on the left side of Fig. 6.6. The snake robot is powered using a power cable
and the target angle for each joint is sent from a computer located on the operator
side through an RS485 interface. It is thus possible to obtain information about the
robot, including the joint angles and the motor current required. The operator can
control the robot’s operations via a gamepad. The software systems run on the Robot
Operating System (ROS).
Fig. 6.6 Overall view of smooth-type snake robot without cover (left) and with cover (right)
Fig. 6.7 System configuration (left) and configuration details (right) of smooth-type snake
robot [48]
It is important for the exterior body surface of the snake robot to be smooth to
prevent it from becoming stuck in the environment. We therefore developed a snake
robot with a smooth exterior body surface by forming the links with the pectinate
shape that is shown in Fig. 6.7. This pectinate shape does not affect the bending of
the joints. The exterior surface also protects the cable. In addition to covering the
cable, the exterior surface guides the cable so that it passes over the joint axis, which
reduces the load acting on the cable when the joint bends.
Fig. 6.10 Center of pressure (CoP) sensors and acoustic sensor implemented in the snake robot
there have been no previously reported studies on snake robots with sensors that can
measure the contact pressure around the entire circumference of a snake robot that
performs three-dimensional motion. In this project, a center of pressure (CoP) sen-
sor is mounted on the high-power-type snake robot to measure the contact pressure
around the robot. The CoP sensor is connected to the microcomputer, which reads
the output data.
When the snake robot is applied to in-pipe inspection, it is very useful to know
where the snake robot is located inside the pipe because it can then record camera
image information in association with the position information provided by the robot
in the pipe. To achieve this, we realized the required functionality by installing an
acoustic sensor, which was developed as part of this project, in the high-power snake
robot. Through use of the acoustic sensor and the IMU sensor, we also developed
localization and mapping software in this project.
All information obtained from the robot is integrated and displayed on the user
interface, which shows the operator a stabilized camera image, the robot’s config-
uration and various sensor data with computer-generated (CG) graphics, including
the location of the robot in the pipe.
We have also fabricated the high-power-type snake robot to dustproof and waterproof
specifications, as shown in Fig. 6.11. By placing the whole robot within a sponge
rubber tube, dustproofing and waterproofing can be realized relatively easily.
(1) Pipe and Duct Inspection: The snake robot that is introduced in this section was
demonstrated in the ImPACT-TRC evaluation field. Figure 6.12 shows the appearance
Fig. 6.12 Pipes in the ImPACT-TRC test field and the snake robot in the pipe
of the test piping field used for evaluation and the state of the snake robot when moving
inside the piping. The inner diameter of the piping is 200 mm. The piping structure
is composed of a straight horizontal pipe of approximately 2 m in length, which
is connected via an elbow pipe to a vertical straight pipe that is approximately 4 m
long and is then connected via another elbow pipe to a straight pipe of about 0.5 m
in length. The full length of the structure is approximately 7 m. There is a gate valve
in the middle of the route that was fully opened during this experiment.
In the demonstration experiments, the snake robot was inserted at the lower piping
entrance and ran through the piping structure above using the helical rolling motion
before reaching the outlet in the upper piping. In addition, we also confirmed that the
information from the sensor mounted on the snake robot could be obtained correctly.
A simulated instrument was placed in front of the piping exit, and the operator was
then able to observe the instrument visually using the image from the camera at the
head of the robot.
When the snake robot runs through the curved pipe portion, appropriate control of
the shape of the snake robot is necessary. To allow the robot to run through the elbow
pipe, we implemented the algorithm to produce the curved helical rolling motion and
applied a method to adjust the servo rigidity of the joints.
Fig. 6.14 Helical wave curve motion to overcome branch point located on the exterior of a pipe
generates the opposite motion to the helical rolling motion in the circumferential
direction. The helical wave propagation motion can be used to unwind the cable that
extends from the tail portion of the snake robot, which was wound up as a result of
the helical rolling motion.
We performed an experiment in which the snake robot climbed over a flange on a
vertical pipe, a horizontal pipe, and a pipe that was inclined at 45 degrees. Another
type of snake robot that we developed was also used in this demonstration. The snake
robot has sponge rubber wrapped around its links to allow it to grip the pipe. The outer
diameter of the pipe was approximately 110 mm, the outer diameter of the flange was
210 mm, and the flange thickness was 44 mm. In the experiments, the operator looked
directly at the snake robot while operating it. The snake robot was able to move over
the flanges on all the pipes. Figure 6.15 shows the snake robot when climbing over
the flange from below. The motion over the flange proceeded semi-automatically
by performing shift control and rolling actions. In the experiments, when slippage
occurred between the snake robot and the pipe and the position relative to the flange
shifted, the relative position between the flange and the snake robot was adjusted via
a rolling motion by the operator’s command.
Fig. 6.15 Demonstration of snake robot moving over a flange on a pipe [47]
Fig. 6.17 Demonstration of movement on a debris field with smooth-type snake robot
Another experiment was conducted in which the snake robot moved across a debris
field using a crawler gait (Fig. 6.17). This debris field was constructed by randomly
laying out concrete pieces, metal plates and pieces of rebar. The experiment was
performed using the smooth-type snake robot, which was covered using the sponge
rubber tube to protect it against dust. The operator commanded the snake robot to
move forward with the shift control, to move sideways by rolling, or to turn. Because
of the high ground tolerance of the crawler gait, the robot was able to move across
the debris field adaptively.
We developed the snake-like articulated mobile robot with wheels, T2 Snake-3 [54],
shown in Fig. 6.18. One of the robot's intended applications is the inspection of
narrow spaces in buildings, e.g., the roof space and underfloor spaces. Another
intended application is plant patrol and inspection. When inspecting the inside of a
building or plant, a robot needs to move through narrow spaces, overcome obstacles
(e.g., pipes), and climb stairs. Moreover, it needs not only to move but also to perform
operations, e.g., opening a door and rotating a valve. The developed robot has the
following features:
• Entering narrow spaces by using its thin body.
• Climbing high obstacles (maximum height of 1 m) by using its long body.
• Semiautonomous stair climbing by using data collected by sensors mounted onto
the bottom of its body.
• Operating equipment by using a gripper mechanism mounted onto its head; e.g.,
rotating a valve, picking up an object, and opening a small door.
Compared to nonwheeled-type snake robots, the wheeled snake robot T2 Snake-3 has
high performance in step climbing, stair climbing, and operations. This subsection
introduces the mechanism, control methods, and performance of the T2 Snake-3
robot.
Table 6.3 lists the robot's parameters. The robot uses the same joint configuration
as the smooth-type and high-power-type snake robots described in Sect. 6.2.
Additionally, wheels are mounted onto the left and right sides coaxially with respect
to the pitch joint, as
shown in Fig. 6.19. This configuration was used in the ACM-R4.2 [28], and allows
the robot to locomote by performing various types of motion such as moving-obstacle
avoidance [49, 53], whole-body collision avoidance [52], and step climbing [27, 51].
The robot changes the posture of its body three-dimensionally by rotating its joints,
and moves forward or backward by rotating its active wheels. On each wheel axle,
one wheel is active with an actuator, while the other wheel is passive without an
actuator. A battery is mounted inside each passive wheel. Because many batteries
are thus distributed along the robot's body, the overall size of the robot is reduced
and the battery life is extended.
The robot is wireless and controlled remotely. Cameras are mounted on its head
and tail. Moreover, range sensors and proximity sensors [14] (see
Sect. 6.5) are located at the bottom of the body. The range sensor measures the
distance between the body and the obstacle, and the proximity sensor measures the
inclination angle between the robot and the surrounding plane. The camera images,
body posture of the robot, and sensor information are displayed on a laptop computer
for the operator’s reference. Thus, this information assists the robot’s operator in
Fig. 6.19 Enlarged view (left) and bottom view (right) of T2 Snake-3 [54]
understanding the robot’s surroundings. Moreover, the robot uses the sensor data
when performing semiautonomous stair climbing.
Figure 6.20 shows the complete robot system. The laptop computer used to operate
the robot, and the robot itself, are connected to the same local area network, which
is managed by a network router. First, an operator uses a joystick to issue commands
to the robot. Next, the laptop computer receives the commands through the Robot
Operating System (ROS) and sends them to the onboard control computer (PC).
Finally, the onboard PC controls the robot by calculating the control input based on
the commands and control algorithm. Additionally, the onboard PC sends the robot’s
information (e.g., joint angle and sensor data) to the laptop computer through the
local network. Then, the laptop displays the posture of the robot and information
regarding the robot’s surroundings based on the received information.
With regard to equipment inspection, the robot can not only collect information
through cameras, but can also operate equipment, e.g., open a door, rotate a valve
or a cock, and push a button. To operate equipment, the robot needs an end-effector
to make contact with an object. Thus, the robot is equipped with the Omni-Gripper
mechanism [12] on its head, which functions as an end-effector. The details of the
gripper are explained in Sect. 6.4. Figure 6.21 shows the robot with the Omni-Gripper
mechanism. If the weight of the head carrying the gripper is large, the robot will not
be able to lift its head up high, owing to the joint torque limit. However, because the
weight of the Omni-Gripper is light, the robot can operate equipment by lifting its
head, as shown in Fig. 6.21.
The robot has many control modes. This subsection outlines the robot's three main
control methods and its performance when using these methods.
Fig. 6.23 Climbing step with riser (height 1030 mm) [54]
from the current posture. The robot was able to pass through a field of crossing
ramps [21] by using the terrain-following method, as shown in Fig. 6.24.
(2) Semiautonomous stair climbing: The robot can climb up and down stairs
semiautonomously, as reported by [54]. The operator provides the following three
control commands through a joystick.
(a) Propulsion velocity.
(b) Deciding whether or not a next step exists.
(c) Adjusting the estimated height of the next riser that the head is approaching.
The robot detects whether or not each wheel touches the ground by using the data
collected from the sensors located at the bottom of its body. Then, it autonomously
and appropriately rotates the pitch joints. Next, the robot climbs up or down the
stairs without collision occurring between the joint part of the robot and the stairs.
Moreover, the semiautonomous stair climbing operation is easy to perform because
the operator can control it by only using one hand, as shown in Fig. 6.25.
Figure 6.25 shows the robot climbing stairs, where the tread depth and riser height
were 300 and 200 mm, respectively. Figure 6.26 shows the robot climbing down steps
composed of treads and risers with various depths and heights. The robot was able
to climb up and down the stairs without getting stuck, even if the tread and riser
sizes of each step were different. Figure 6.27 shows the robot climbing up a steep
staircase with a slope angle of 54.5°. The robot was able to climb the steep staircase;
however, the parameters of the stairs shown in Fig. 6.27 were not compliant with the
ISO 14122-3 standard [20].
(3) Trajectory tracking control of the head: If the abovementioned three-
dimensional steering method is used, the robot will not be able to move its head
sideways while maintaining the orientation of the head. Additionally, the
robot will not be able to change the orientation of its head while maintaining the
position of its head. Thus, the robot cannot control both its position and orientation in
three-dimensional space by using the three-dimensional steering method. Therefore,
a control method that accomplishes the trajectory tracking of the controlled point
(e.g., the head of the robot or the tip of the gripper) was implemented in the robot. In
this method, which is modified based on [50, 53], the operator provides the position
and orientation target value of the controlled point (three translational parameters and
three rotational parameters) through a joystick. Then, the robot calculates the control
input to accomplish the target motion of the controlled point and rotates its joints and
wheels. Therefore, the robot can accomplish the trajectory tracking of the controlled
point in two-dimensional or three-dimensional space, as shown in Fig. 6.28. This
control method is used in operations utilizing the gripper located on the robot’s head.
Controlled point
Fig. 6.28 Trajectory tracking control of the head in a two-dimensional space and b three-
dimensional space
6.4.1 Abstract
6.4.2 Introduction
In this study, prototype grippers for grabbing objects with different shapes by apply-
ing a low pushing force were designed and tested. Nowadays, many grippers that can
adapt their shapes to grab objects are being developed, including connected rigid-
body grippers [7, 9, 10, 17, 33], grippers with functional fluids [6, 11, 32, 38, 68],
grippers using malleable bags [18, 19, 39, 71], and grippers that utilize the jamming
transition phenomenon of granular powders [3]. A problem with conventional
jamming grippers that are filled with powder is that because they need to push forcibly
against the object to grip it securely, a fragile object could break under pressure. To
solve this problem, we devised a mechanism that applies the variable-inner-volume
principle to reduce the pushing force.
In other words, the amount of powder can be changed relative to the volume
inside the bag. By using only a small amount of powder filling, the gripper can adapt
its shape to grip the object with a low pushing force. However, a partially filled
bag could collapse when the inside of the bag is vacuumed. This collapse can be
prevented by increasing the amount of powder to fill the space inside the bag. Thus,
this variable-inner-volume concept is applied in the development of a three-layer
jamming membrane gripper that allows objects to be held with a low pushing force.
This gripper can be installed on mobile robots and used for gripping objects and
pressing buttons. An example of the apparatus is shown in Fig. 6.29, where the
proposed gripper is installed on a snake-like robot and directed toward opening a
switchboard door. In this manner, the proposed gripper can be installed even on a
low-power mobile robot. Existing jamming grippers require a high pushing force to
deform the gripper into an approximate shape of the object to be picked up. It is
therefore very difficult to integrate such a gripper into a snake-like robot, as shown in
Fig. 6.29, because the robot does not have enough power to use it. In contrast, the
proposed gripper can be effective in disaster situations. In this study, we describe a three-layer
membrane structure and its principle, examine a prototype gripper with a three-layer
membrane structure, and investigate a method for equipping it on robots.
Fig. 6.29 Illustration of a switchboard opened by a snake-like robot with a jamming membrane
gripper
The dimensions of the different components of the gripper should meet the following
requirements. The outer membrane should have a diameter of 50 mm, the inner
membrane should have a diameter of 38 mm, the length of the gripper should be
60 mm, and the gripper tip radius should be 10 mm (for pushing a button). The air
between the outer and inner membranes was vacuumed through a groove in the cup.
The CAD drawing of the gripper is shown in Fig. 6.31. The thickness of the rubber
membrane is 3 mm considering the accuracy of modeling, malleability, and strength.
The membrane is fixed by wires and not with an adhesive so that it can be replaced.
The gripping performance of the gripper depends on various factors such as the
size of the powder particles, materials, pressure, and fill factor (amount of powder
in the bag). To determine the optimal fill factor, we compared the performance of
four gripper configurations, as shown in Fig. 6.32: three grippers with 10, 20,
and 30 g of powder added between the outer and inner membranes, and a previous
type of jamming gripper that is completely filled with powder (52 g of powder). In
Fig. 6.32, the amount of powder filling (R1 ) is normalized by the following equation.
An overview of the experimental setup is shown in Fig. 6.33. As the target pickup
object (load), we chose a cylinder with a diameter that was 80% that of the grip-
per (40/50 mm). The cylinder was made of MC nylon with a smooth surface. The
gripper vertically approached the object through a linear actuator with a stroke of
50 mm along the z-axis. A force sensor, which measured the gripper’s push force,
was set at the base of the gripper and was connected to the actuator. The z-depth
was measured using a linear potentiometer in the linear actuator unit. Each gripper
grabbed the object 10 times.
The experimental results is shown in Fig. 6.34. In the graphs shown in this figure,
the horizontal axis represents the minimum pressing amount, which is normalized
using the following equation:
R2 = (current pressing amount) / (maximum length of the gripper bag, 100 mm)   (6.14)
The graphs show the relationship between the pressing amount and the pushing force along the z-axis at each point; the vertical axis shows the corresponding pushing force.
These experiments indicated that a large amount of powder filling corresponded
to a large standard deviation of the recorded forces. In addition, when the amount of
powder was large, the push force, on average, was high. However, the gripper with
10 g of powder in the interstitial bag could not grasp the object because of significant
deformation (i.e., buckling). Meanwhile, the gripper with 20 g of powder could grasp
the object with a lower pushing force and smaller dispersion than grippers with 30
and 52 g of powder. A low push force (preload) is preferred to avoid damaging the
object, particularly objects that are delicate or fragile. While grasping an object,
a maximum preload of less than 5 N is desirable (i.e., a pressure of less than 5 N over the gripper's 50-mm-diameter contact area). Therefore, we determined that 20 g was the optimal condition in a tradeoff
50 mm). Therefore, we determined that 20 g was the optimal condition in a tradeoff
between a low preload force for handling delicate objects and a sufficiently large
load capacity for handling comparatively heavy objects.
The gripper with 20 g of powder had a holding force of 11.28 N, which was well
over the target holding force of 1 kgf. Therefore, at this stage, the amount of powder
inside the soft bag was set to 20 g.
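The tradeoff above (preload below 5 N, holding force above the 1 kgf target) can be sketched as a simple selection rule. In the snippet below, only the 11.28 N holding force, the 5 N preload limit, and the 9.8 N (1 kgf) target come from the text; the remaining numbers are hypothetical placeholders for illustration:

```python
# Hypothetical per-fill summary (g -> mean preload [N], holding force [N]).
# Only the 20 g holding force (11.28 N) is from the text; the rest are
# made-up placeholders.
candidates = {
    10: {"preload": None, "holding": 0.0},    # buckled; could not grasp
    20: {"preload": 3.0, "holding": 11.28},
    30: {"preload": 6.5, "holding": 12.0},
    52: {"preload": 9.0, "holding": 12.5},    # conventional, fully filled
}

TARGET_HOLD = 9.8   # target holding force: 1 kgf in newtons
MAX_PRELOAD = 5.0   # preload limit for delicate objects

def best_fill(data):
    """Lightest powder fill satisfying both constraints."""
    ok = [g for g, d in data.items()
          if d["preload"] is not None
          and d["preload"] < MAX_PRELOAD
          and d["holding"] >= TARGET_HOLD]
    return min(ok) if ok else None

print(best_fill(candidates))  # -> 20
```

With these placeholder numbers, the rule reproduces the chapter's choice of 20 g.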
The experiment involved picking up a set of keys from the floor using the same gripper; Fig. 6.35 offers a closeup view of the gripper holding the keys. The experimental procedure is as follows:
(1) The gripper is made malleable (the vacuum is released).
(2) The gripper is pushed against the keys.
(3) It grips them by stiffening, which is caused by vacuuming of the powder filling layer.
(4) The keys are picked up.
This experiment demonstrated that the gripper could push the button and open the
switchboard door using the handle as shown in Fig. 6.36. The sharpened tip enabled
the gripper to push the button. It is difficult for conventional jamming grippers that
are completely filled with powder to push the button because the tip is not sharpened.
In addition, it is difficult for them to grip the handle, because the handle is pushed back by the high pushing force as the gripper conforms to its shape. This result indicates
that the three-layer membrane jamming gripper generates a lower pushing force than
conventional jamming grippers that are completely filled with powder.
6.4.5 Conclusion
Fig. 6.36 Conditions of the experiment for opening and closing the switchboard
In this section, we describe tactile sensors attached to the entire body of the snake
robot developed by Kamegawa et al. The snake robot, which has no active wheels,
drives several joints arranged in series to generate motion of the body, and generates
propulsive force using various parts of the body to push the ground, walls, and
other surrounding objects. However, in an unmodeled environment, the state of the
contact between the snake robot and the environment is unknown, so efficient locomotion cannot be performed. Furthermore, in a narrow environment, which is one of the specialized areas of operation for snake robots, the body shape is strongly constrained by the environment, and there is a risk of damage to the robot body due to the reaction force from the environment during the motion generation process.
Tactile sensors arranged on the entire body of the snake robot can be used to address
these problems, as well as improve the efficiency and reduce the risk of damage by
detecting the contact states with the environment throughout the body.
Various types of tactile sensing methods have been proposed, utilizing resistive [44, 59], capacitive [36], optical [63, 67], magnetic [56], and piezoelectric [69] principles. These sensors have contributed by providing action-related information, control parameters for motion, and estimates of contact states in the field of robotics. In many of these studies, tactile sensors were implemented only at specific points or limited areas on the robots, such as the fingertips of robotic hands [24, 43]. Some studies mounted tactile sensors on large areas of the robot bodies by arranging a number of sensor elements in a dispersed manner [15] or by covering a robot with a stretchable sensor sheet [66]. Building on these studies, it is important to implement tactile sensors whose detection functions match the characteristics of the robots and the purposes of using the tactile information.
The following are important points to be considered in the design of a tactile
sensor for a snake robot: First, the range of movement of the joints of the snake robot
should not be limited. The snake robot under consideration has 20 joints arranged
in series, and each has a movable range of ±90◦ or more. If the tactile sensor, its
circuit boards, and cables occupy a large volume, there will be interference when
the joint flexes significantly, thereby hindering the generation of motion. Therefore,
it is desirable that the structure of the tactile sensor itself is thin, the circuit board
is compact, and the number of cables is low. Second, the entire body surface of the
robot should be covered as much as possible using the tactile sensors. There are
various motions that use contact over the entire body, and there is a possibility that
unintended parts may come in contact with an unmodeled environment. Therefore,
it is desirable that the sensing area of the tactile sensor should be evenly spread in
all directions so that it can detect the environment in any situation.
Figure 6.37 shows the appearance of the developed tactile sensor. The specifica-
tions are as listed in Table 6.5.
Many of the considerations in the design of tactile sensors are derived from the
requirement that they should be mounted on parts directly in contact with the outside
environment. Accordingly, its design tends to affect the robot’s ability to generate
motions and perform tasks. Therefore, we describe the design of the developed tactile sensor, paying careful attention to a structure that does not disrupt the snake robot's original capabilities and to sensing performance that can acquire data to improve those capabilities.
Fig. 6.37 Tactile sensor for the non-wheel high power type snake robot in Sect. 6.2. The sensor
was designed to cover the surface of the entire outer shell parts of the robot
Figure 6.38 shows a schematic diagram of the structure of the snake robot. Pitch joints
and yaw joints are alternately arranged in series at regular intervals. This structure was
formed by connecting the servomotors while twisting them by 90°. Cylindrical outer shells made of ABS resin surround the servomotors and serve as the parts that generate force interaction in contact with the environment. Rubber rings covering the outer shells absorb the error between the environment and the body shape; they are softly crushed and exert high frictional forces. The tactile sensors are mounted on the layer
between the outer shells and the rubber rings. That is, the tactile sensor has a thin
sheet-like structure and is wound around the outer shell surface. As a result, the
portion close to the surface layer can be significantly deformed and generates high
frictional force, while the force applied on the surface is transmitted to the tactile
sensor without leakage. Note that the force is dispersed when the rubber ring is thick;
thus, the thickness of the rubber should be reduced to detect the contact points with
high spatial resolution.
The method of attaching the tactile sensor and the rubber ring to the outer shell
is one of the difficult challenges in the structural design. The requirements for this
are to prevent the tactile sensor from reacting in the unloaded state and to prevent it from breaking due to peeling forces. The first requirement prohibits binding through shrinkage of the rubber ring: with such binding, the contraction force acts on the tactile sensor and it reacts even though no force is received from outside. The
second requirement prohibits bonding of the rubber ring and the tactile sensor. The
tactile sensor, which is composed of thin flexible materials, is sufficiently strong under compression but weak when peeled off. As shown in Fig. 6.39a, the problem
of breakage of the tactile sensor occurs when a bending load acts on the rubber
ring. Therefore, we implemented the joining method such that the tactile sensor was
attached by bonding with the outer shell surface, while the rubber ring was joined
to the frame extending from the outer shell as shown in Fig. 6.39b. The joint makes
several point constraints through small protrusions biting into the rubber ring from
the frame. When a compressive load acts on the rubber ring from outside, the rubber
ring deforms to compress the tactile sensor, and the load can be detected. On the
other hand, even if a bending load acts on the rubber ring, since the tensile load is
supported by the protrusions, the force is not sufficient to peel off the tactile sensor
and the bonding between the rubber ring and the outer shell can be maintained.
According to the above design of the sensing area, the tactile sensor appropriately
responds when the rubber is pressurized through contact with the external environ-
ment, and the structure can withstand loads in any direction.
Fig. 6.39 Methods of attaching the tactile sensor and the rubber ring to the outer shell. a First
model of the tactile sensor, which was bonded to the rubber ring and was frequently damaged when
a large bending load was applied during snake robot operation. b New model with the proposed
joint mechanism, which enables both reaction to the normal force and endurance to the bending
load
Tactile sensors for robots should be properly designed not only for the structure of the robot but also for how the tactile information is used in robot operation. For example, the sensor should have high spatial resolution if it is designed for precise measurement of
the environment shape, while it should be capable of swiftly providing useful data
for motion generation if the objective is to improve locomotive capacity and task
performance. The tactile sensor developed in the present study aims to improve the
capability to move in narrow and complicated spaces that are not visible to operators, such as inside piping. Therefore, it was designed to have appropriate sensing
performance, especially in the usability of acquired information and its sampling
rate.
We assumed that the information useful to the operator when the snake robot moves in a narrow space is as follows: whether the robot is in contact with the environment with an appropriate force, and the direction in which the force acts. The former is intended to
avoid problems in which the servomotor is overloaded when the contact force is too
large and the necessary propulsive force cannot be exerted when the contact force
is too small. The latter is to ensure the shape of the robot is properly aligned with a
local change in the terrain such as a curved portion of the piping. On the other hand,
force distribution with high spatial resolution is not essential for controlling the robot
because, in many cases, contact with the environment occurs at only one point on each link of the robot. Rather, acquiring a large amount of data takes a long time. To shorten the control cycle of the robot and smoothen the operation, it is desirable to limit the information to the bare minimum. Based on the above considerations, we implemented
a sensing method that acquires only the center position and the total amount of the
force distribution in the sensing area in each link in a short time.
The sensing method of the tactile sensor is based on the principle of the center-of-pressure (CoP) sensor [44]. The CoP sensor is composed of several pressure-sensitive elements
capable of covering a large area and an analog computing circuit that calculates
the center position and the total amount of the distribution of reaction amounts in
the pressure-sensitive elements at high speed. Compared to the method of digitally
acquiring the reaction of all pressure-sensitive elements, this method acquires only
useful information extracted by the analog computing circuit inside the sensor, and is
therefore superior in terms of speed and savings in wiring. Here, high speed means not
only decreasing the information acquisition time, but also the information processing
time for motion control, which usually requires a high processing load and should
be implemented by a microcomputer mounted on the robot.
The circuit diagram of the CoP sensor is shown in Fig. 6.40. Here, m × n pressure
sensitive elements are arranged in a two-dimensional matrix. Each pressure sensitive
element is composed of two separate electrodes and a pressure conductive rubber to
bridge the electrodes. The pressure conductive rubber contains conductive particles
inside silicone rubber, a functional material whose electrical resistance changes by
Fig. 6.40 Circuit diagram of the CoP sensor. The pressure-conductive rubber can be modeled as
a variable resistance. Both ends of the variable resistors are connected to the analog circuits to
calculate the center position of the flowing current in x and y directions
forming a conductive path according to the compressive force. The two electrodes
are connected to positive and negative power supplies, respectively via resistors
R0 . Here, the positive-side electrode is short-circuited in the row direction, while
resistors r are sandwiched in the column direction, and the negative-side electrode
is short-circuited in the column direction while resistors r are sandwiched in the row
direction. Using the electrical circuit, it is possible to calculate the center position
in the two-dimensional coordinate and the total amount of the current distribution
generated in the pressure-sensitive elements by measuring the potentials of only four
points Vx+, Vx−, Vy+, and Vy− with the following equations:
xc = (1 + R0/r) · (2/(m − 1)) · (Vx+ − Vx−)/(Vx+ + Vx−)   (6.15)

yc = (1 + R0/r) · (2/(n − 1)) · (Vy+ − Vy−)/(Vy+ + Vy−)   (6.16)

Iall = (Vx+ + Vx− − 2V−)/R0 = (2V+ − Vy+ − Vy−)/R0   (6.17)
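Read as plain arithmetic, Eqs. (6.15)–(6.17) can be evaluated directly from the four measured potentials. The sketch below is illustrative only: the grouping of the (1 + R0/r) factor is inferred from the printed layout, the default component values loosely mirror the fabricated sensor, and using a single r for both axes is a simplification.

```python
def cop_outputs(vxp, vxm, vyp, vym, R0=1000.0, r=47.0, m=78, n=12,
                v_plus=3.3, v_minus=0.0):
    """Center position (xc, yc) and total current Iall of a CoP sensor,
    following Eqs. (6.15)-(6.17). vxp/vxm/vyp/vym are the four measured
    potentials Vx+, Vx-, Vy+, Vy-."""
    xc = (1 + R0 / r) * (2 / (m - 1)) * (vxp - vxm) / (vxp + vxm)
    yc = (1 + R0 / r) * (2 / (n - 1)) * (vyp - vym) / (vyp + vym)
    # Eq. (6.17): both printed expressions give the same total current;
    # here we use the V- form.
    i_all = (vxp + vxm - 2 * v_minus) / R0
    return xc, yc, i_all

# A symmetric load (equal potentials on both ends) is centered.
print(cop_outputs(1.0, 1.0, 1.0, 1.0))  # -> (0.0, 0.0, 0.002)
```

The point of the analog circuit is that these three quantities summarize the whole element matrix from only four voltage measurements.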
In addition, a new sensing mechanism was introduced in the tactile sensor devel-
oped in the present study for the estimation of the normal force and the tangential
force by overlaying the CoP sensors [45]. The structure consists of two identical CoP
sensors with a thin flexible material sandwiched between them. Because the flexible
material undergoes shear deformation when tangential force is applied, the position
of one CoP sensor with respect to the other shifts in the tangential direction. Here,
the difference between the center positions of the normal force obtained from both
sensors correlates with the shear strain. Thus, it is possible to estimate the tangential
force from the difference.
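A minimal sketch of this two-layer idea follows; the linear gain k relating the shift of the centers of pressure to tangential force is a purely hypothetical calibration constant (the chapter does not give one):

```python
def tangential_force(c_outer, c_inner, k=1.0):
    """Estimate tangential force from the shift between the centers of
    pressure of two stacked CoP sensors. c_outer/c_inner: (x, y) center
    positions from the outer and inner layers; k is a hypothetical gain
    [N per unit of center shift] assuming linear shear of the interlayer."""
    dx = c_outer[0] - c_inner[0]   # shear displacement of the interlayer
    dy = c_outer[1] - c_inner[1]
    return k * dx, k * dy

print(tangential_force((0.3, 0.1), (0.2, 0.1), k=2.0))  # x-shear only
```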
The fabricated flexible substrate of the tactile sensor is shown in Fig. 6.41. A single
substrate can cover half of the outer shell surface with a cylindrical shape (diameter
80 mm, width 25.2 ∼ 29 mm). The portions (A) and (B) in the figure are pressure
sensitive parts on the outer layer and the inner layer, respectively, and each has 78 ×
12 pressure sensitive elements. The chip resistors of the analog computing circuit of
the CoP sensors are mounted on the portions (C) and (D). The portion (E) indicates
an analog-to-digital (A/D) converter for measuring the output of the CoP sensors and
performing serial communication with the microcomputer on the robot. The portion
(F) is a terminal to connect with the sensor substrate on the opposite side of the outer
shell. The substrate is folded back twice at two constricted points so that the portion
including the A/D converter is inside the outer shell, the pressure sensitive part of the
inner layer is on the outer shell surface, and the pressure sensitive part of the outer layer is on the outside of a flexible material with a thickness of 2 mm.
The size of one pressure sensitive element, which was designed to be as small
as possible, is 1.6 mm. This is to increase the linearity of the output on the center
position of the CoP sensor by subdividing the pressure sensitive part. This high
linearity improves the calculation accuracy of the tangential force from the difference
of the center position output of the two layers of the CoP sensors. Here, the analog
computing portion of the resistance circuit comprises 92 chip resistors for each CoP sensor: 77 internal resistors (rc = 47 Ω) connecting the electrodes in the circumferential direction, 11 internal resistors (rw = 330 Ω) in the width direction, and 4 external resistors (1000 Ω) connecting the electrodes at the ends to the power supplies (Vdd = 3.3 V and GND). Each resistance value was determined based on loading experiments using a prototype of the sensor.
We implemented the tactile sensor on all 20 links of the non-wheel type snake robot. Because a microcomputer that acquires sensor information was installed in every two links, cables were connected from each microcomputer to two sensors.
The cable includes only four lines: the sensor power supply line (VDD), the ground
line (GND), the serial clock line (SCK), and the serial data line (SDA) for I2C communication. We incorporated the cable without affecting the internal structure of
the robot. For the sensor power supply, which can be selected within the range of
2.7–5 V, we supplied DC 3.3 V in common with the microcomputer. In this case, the
consumption current was theoretically approximately 6.6 mA per sensor substrate
(maximum), which occurs when the maximum detectable force acts on the sensor.
Motion experiments were conducted several times in which the snake robot moved
inside the piping, including vertical parts and bent parts. The proposed tactile sensor
was able to produce stable output when in contact with the inner wall of the pipe
during the operation, and there was no damage to the sensor. This result indicates
that the developed tactile sensor can be mounted without restricting the operation
of the snake robot, can acquire data for motion generation when in contact with the
environment, and can withstand the applied load during operation.
Online localization of in-pipe robots is essential for both autonomous pipe inspection
and remote operator support. Conventional localization methods that use encoders, visual sensors, and inertial sensors [13, 25, 29, 31, 35, 64] suffer from cumulative errors and are vulnerable to sudden slips or other unintended movements of the
robot. Sensor systems such as Global Positioning System (GPS) sensors and magne-
tometers can determine absolute locations outdoors and in most indoor environments,
but pipeline environments disturb the required radio waves and magnetic fields [34].
While several methods simply use the length of the power cable between the robot
and the entrance to the pipeline [31, 35], the cable length is an unreliable measure
in large-diameter pipelines because the cable may curve or coil in the pipe.
Sound-based distance estimation can measure the shortest distance along the pipeline from the entrance of the pipeline to the robot. Placing a microphone
on the robot and a loudspeaker at the entrance enables the distance between them
to be estimated by measuring the time-of-flight (ToF) of a reference sound that is
emitted from the loudspeaker [26, 70] (and vice versa). Because the ToF of a sound
Fig. 6.43 a Localization sensor module (microphone, IMU module, reference audio signal; 0.32 m). b Layout of the sensor module and the snake robot (2 m, 20 modules; connected to the robot)
is mainly affected by the contents and the surface of the pipeline, the method works
in GPS-denied environments.
The proposed method estimates the robot location and the pipeline map simul-
taneously by combining the distance estimated based on the ToF with the robot
orientation that is estimated from the IMU observations [5]. A nonlinear state-space
model that represents the relationships among the observations, the robot location,
and the pipeline map is formulated. The pipeline map is represented as the inner
space of the pipeline. The robot location is estimated using an extended Kalman
filter in an online manner [55], and the pipeline map is estimated based on the past
locus of the robot’s location after each update. Figure 6.42 shows an overview of
the proposed online sound-based method for robot localization and pipeline map-
ping, which combines ToF-based distance estimation with IMU-based orientation
estimation.
As shown in Fig. 6.43, a sensor module containing a microphone and the IMU is
attached to the tail cable of the robot. The microphone on this module is connected to
a synchronized stereo analog-to-digital (A/D) converter. The other input of the A/D
converter is connected to an audio cable that indicates the onset time of the reference
signal that is emitted from the loudspeaker located at the entrance to the pipeline.
The IMU sensor contains a three-axis gyroscope and a three-axis accelerometer, with
the axes indicated in Fig. 6.43a.
The distance dk between the robot and the entrance to the pipeline is estimated
by measuring the ToF of a reference signal that is emitted by the loudspeaker. As
shown in Fig. 6.44, the ToF is measured as the onset time difference between the
microphone and the audio cable that transfers the audio signal that was emitted by
the loudspeaker. We assume that the pipeline is filled with a homogeneous gas and
that the temperature and the pressure in the pipeline are constant. Based on these
assumptions, the distance between the loudspeaker and the microphone is estimated
to be d_k = (τ_k^mic − τ_k^ref) C, where τ_k^mic and τ_k^ref represent the onset times of the signal
recorded using the microphone and the reference audio signal, respectively, and C
is the speed of sound in the pipeline. To provide a robust estimate of the onset time,
we use the onset estimation method that was proposed in [4]. A time-stretched pulse
(TSP) [46] that is robust with respect to noise is used as a reference signal that is
emitted by the loudspeaker. The onset time of the reference signal is calculated using
the generalized cross correlation method with phase transform (GCC-PHAT), which
is robust against reverberation [70].
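The ToF pipeline above (GCC-PHAT, then first-peak picking, then d_k = τ·C) can be sketched as follows. This is not the authors' implementation: the 50%-of-maximum threshold used to locate the "first peak" is a simplification of the onset method cited in [4].

```python
import numpy as np

def tof_distance(mic_sig, ref_sig, fs, c=343.0):
    """Estimate mic-loudspeaker distance from the onset-time difference.
    GCC-PHAT gives a sharp correlation peak; picking the *first* prominent
    peak (rather than the global maximum) rejects later reverberant paths."""
    n = len(mic_sig) + len(ref_sig)
    spec = np.fft.rfft(mic_sig, n) * np.conj(np.fft.rfft(ref_sig, n))
    spec /= np.abs(spec) + 1e-12                # PHAT weighting
    cc = np.fft.irfft(spec, n)[: n // 2]        # non-negative lags only
    lag = int(np.argmax(cc >= 0.5 * cc.max()))  # first prominent peak
    tau = lag / fs                              # onset-time difference [s]
    return tau, tau * c                         # ToF and distance

# Synthetic check: a sweep delayed by 160 samples at 16 kHz (10 ms ToF).
fs = 16000
t = np.arange(16384) / fs
ref = np.sin(2 * np.pi * (100 * t + 1500 * t ** 2))  # chirp-like sweep
mic = np.zeros_like(ref)
mic[160:] = ref[:-160]
tau, d = tof_distance(mic, ref, fs)
print(round(tau * fs))  # estimated lag in samples
```

The phase-transform normalization is what keeps the correlation peak sharp inside a highly reverberant pipe.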
A state-space model that represents the relationships among the robot location, the
pipeline map, and the observations, as shown in Fig. 6.45, is formulated. The latent
variables of the state-space model consist of the robot state and the pipeline map. The
robot state z_k = [x_k^T, v_k]^T ∈ R^4 consists of the robot's location x_k ∈ R^3 and its velocity v_k ∈ R. The pipeline map M ⊂ R^3 represents the interior space of the pipeline. For convenience, the pipeline map is represented here by a union of N spheres (Fig. 6.46), M = ∪_i S(p_i, r_i), where S(p_i, r_i) = {x ∈ R^3 : ‖x − p_i‖ < r_i} represents a sphere with center position p_i ∈ R^3 and radius r_i (i = 1, 2, . . . , N).
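Under this representation, the map M is just a growing collection of spheres. A minimal sketch (the class and method names are mine, not from the chapter):

```python
import math

class PipelineMap:
    """Pipeline map M represented as a union of spheres S(p_i, r_i)."""

    def __init__(self):
        self.spheres = []                  # list of ((x, y, z), radius)

    def add(self, p, r):
        """Grow the map around a robot location p: M <- M ∪ S(p, r)."""
        self.spheres.append((tuple(p), float(r)))

    def contains(self, x):
        """True if point x lies in the mapped inner space of the pipeline."""
        return any(math.dist(x, p) < r for p, r in self.spheres)

m = PipelineMap()
m.add((0.1, 0.0, 0.0), 0.1)          # sphere around the initial location
print(m.contains((0.15, 0.0, 0.0)))  # -> True
```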
State Update Model
The state update model of the robot state p(z_{k+1} | z_k) is formulated based on two individual update models. The current robot location x_k is updated based on the current robot orientation e_k and velocity v_k:
where Σ_x|z ∈ R^{3×3} represents the covariance matrix of the process noise of the robot location. The velocity update model p(v_{k+1} | z_k) is represented by a random walk:
The current robot state z_k is estimated from the measurements d_{1:k} and e_{1:k} in an online manner using an extended Kalman filter (EKF). The derivative ∂f_d(M, x_k)/∂z_k, which is required by the EKF, is approximated numerically. The pipeline map M, in contrast, is estimated after each update of the
robot state. The space around the robot’s current location can be assumed to be within
the pipeline. Therefore, the pipeline map is updated by adding the space around the robot location x_k as M ← M ∪ S(x_k, r), where r > 0 is a parameter that represents the radius of the sphere to be added.
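The two update models can be sketched as a single prediction step; Δt and the noise scales below are hypothetical parameters (the chapter determined its parameters experimentally), and the EKF covariance machinery is omitted:

```python
import numpy as np

def predict_state(x, v, e, dt, rng, q_x=0.0, q_v=0.0):
    """One prediction step of the state update model (a sketch).
    x: location (3,); v: scalar velocity; e: unit orientation vector (3,).
    The location advances along the orientation, and the velocity follows
    a random walk; q_x, q_v are hypothetical process-noise scales."""
    x_next = np.asarray(x) + v * dt * np.asarray(e) + rng.normal(0.0, q_x, 3)
    v_next = v + rng.normal(0.0, q_v)
    return x_next, v_next

rng = np.random.default_rng(0)
# Noise-free step: moving at 0.2 m/s along +x for 0.5 s advances 0.1 m.
x1, v1 = predict_state([0.1, 0.0, 0.0], 0.2, [1.0, 0.0, 0.0], 0.5, rng)
print(x1)  # advances 0.1 m along x (noise scales default to zero)
```

After each such update, the map would be grown around x1 with M ← M ∪ S(x1, r), as described above.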
The ToF-based distance estimation method was evaluated in a 6-m pipe using a
loudspeaker equipped at the pipe entrance and a microphone that moves within the
pipe.
Experimental Settings
Figure 6.47 shows a mockup pipeline for the evaluation. The pipeline has a length
of 6 m, a diameter of 0.2 m and two elbow sections. A loudspeaker was equipped at
the entrance of the pipeline and a microphone was suspended by a nylon cord in the
pipeline. The distance between the microphone and the loudspeaker was increased
in 0.1 m increments by drawing up the nylon cord. A TSP reference signal used in
this evaluation has a length of 16384 samples (1.024 s) at 16 kHz. The proposed ToF-
based estimation method (gcc-phat/first-peak) was compared with the following two
baseline methods: (1) using cross correlation instead of GCC-PHAT (cc), and (2)
extracting the maximum correlation coefficient instead of extracting the first peak
(max-peak) of the GCC-PHAT coefficients.
Experimental Results
Figure 6.48 shows the estimated distance at each actual distance. When the micro-
phone was placed in front of the first elbow, the estimation errors of the proposed
method (gcc-phat / first-peak) were less than 0.1 m. Moreover, when the microphone
was placed beyond the first elbow, the estimation errors of the baseline methods
Fig. 6.48 Estimated distances and their errors with the ToF-based distance estimation
became greater than those of the proposed method. Although the estimation errors
of the proposed method increased when the microphone crossed over each elbow, the
estimation errors were less than 7% of the actual distances. This result shows that the proposed distance estimation method works precisely even in highly reverberant in-pipe environments.
The proposed sound-based localization method was evaluated with a moving in-pipe
snake robot.
Experimental Settings
As shown in Fig. 6.49, the proposed localization method was evaluated under the
following three configurations:
Fig. 6.49 Three configurations for the experimental evaluation. The robot moved in the pipeline
along the red line
C-shape configuration: Two 1-m pipes and one 2-m pipe, both 0.2 m in diameter,
were connected with two elbow pipes to form a horizontal C shape.
F-shape configuration: Four 1-m pipes 0.2 m in diameter were connected with one
elbow pipe and one T-intersection to form an F shape.
Z-shape configuration: Three 1-m pipes 0.2 m in diameter were connected with two
elbow pipes to form a vertical Z shape.
The accelerometer and gyroscope signals were sampled at 200 Hz and 16 bits. The
initial state z_0 = [x_0, v_0] was set to x_0 = [0.1, 0, 0] (m) and v_0 = 0 (m/s). The
Table 6.6 Precision, recall, and F-measure of the estimated pipeline maps
Method Configuration Precision (%) Recall (%) F-measure (%)
tof-imu-pc C-shape 68.2 90.0 77.6
F-shape 72.7 97.8 83.4
Z-shape 57.7 83.0 68.0
tof-imu C-shape 12.6 16.2 14.1
F-shape 23.7 31.8 27.1
Z-shape 18.5 27.1 22.0
same TSP reference signal as in Evaluation 1 was used. The other parameters were
determined experimentally.
In this experiment the accuracy of the estimated pipeline map was evaluated
instead of the location of the sensor module because it was difficult to accurately
determine the ground truth location of the module. Since the proposed method esti-
mates the pipeline map as a union of spherical regions, the accuracy of the map was evaluated with volume-ratio precision and recall as follows:
Precision(M̄, M) = V(M̄ ∩ M) / V(M̄)   (6.22)

Recall(M̄, M) = V(M̄ ∩ M) / V(M)   (6.23)
where M̄ and M are the ground truth and estimated pipeline maps, respectively, and V(A) represents the volume of A. Since, in the F-shape configuration, the robot did not enter the branched middle 1-m pipe (Fig. 6.49b), the pipeline map was estimated as an L shape. Therefore, we did not take the volume of the branched part into account in this evaluation. To investigate the effectiveness of the perpendicular condition
that assumes pipelines to be straight or connected at right angles, we compared the
proposed method (tof-imu-pc) with a baseline method (tof-imu) that does not assume
the perpendicular condition. That is, tof-imu uses the raw orientation ê_k estimated
by the IMU-based estimation (Sect. 6.6.2).
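When both maps are sphere unions, the volume ratios in Eqs. (6.22)–(6.23) can be approximated by Monte Carlo sampling. This sketch is mine, not the chapter's evaluation code; following the usual convention, it normalizes precision by the estimated map's volume and recall by the ground truth's:

```python
import numpy as np

def mc_map_scores(est, gt, lo, hi, n=100_000, seed=0):
    """Monte Carlo precision/recall/F-measure between two sphere-union
    maps. est/gt: lists of (center, radius); [lo, hi]: a box enclosing
    both maps; n: number of uniform samples."""
    rng = np.random.default_rng(seed)
    pts = rng.uniform(lo, hi, size=(n, 3))

    def inside(spheres):
        hit = np.zeros(n, dtype=bool)
        for p, r in spheres:
            hit |= np.linalg.norm(pts - np.asarray(p), axis=1) < r
        return hit

    in_est, in_gt = inside(est), inside(gt)
    inter = np.count_nonzero(in_est & in_gt)
    prec = inter / max(np.count_nonzero(in_est), 1)
    rec = inter / max(np.count_nonzero(in_gt), 1)
    f = 2 * prec * rec / max(prec + rec, 1e-12)
    return prec, rec, f

# Identical maps score 100% on all three measures.
scores = mc_map_scores([((0, 0, 0), 0.5)], [((0, 0, 0), 0.5)], -1.0, 1.0)
print(scores)
```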
Experimental Results
Table 6.6 shows the precision, recall and F-measure for the estimated pipeline map
for each configuration. In all three configurations, the precision and recall of the proposed method (tof-imu-pc) were at least 57.7% and 83.0%, respectively. Figure 6.50 shows the estimated robot location and pipeline map at each time step. Although the estimated robot location at the last measurement in each configuration had an error of more than 10 cm from the actual location, tof-imu-pc correctly estimated the locations of the elbow sections. These results showed that the proposed method could robustly estimate the pipeline map even when the pipeline curved vertically or horizontally (C-shape or Z-shape) or had a branch (F-shape).
Fig. 6.50 The estimated pipeline map and self-location at each measurement, and corresponding
picture of the robot and pipeline are shown. The point, line, and circles indicate the estimated loca-
tion, sound propagation path, and pipeline map, respectively. Black solid lines indicate the ground
truth pipeline map. The sensor module ended up at the dashed black line at each configuration. The
red circles in the pictures indicate the location of the sensor module
Fig. 6.51 3D projections of the ground truth pipeline (transparent white) and estimated pipeline
maps (red: tof-imu-pc, blue: tof-imu)
On the other hand, compared to the tof-imu-pc, the precision and recall were
significantly degraded when the perpendicular condition was not assumed (tof-imu).
Figure 6.51 shows the 3D projections of the estimated pipeline maps. The pipeline
maps estimated by tof-imu curved even at the straight sections because the IMU-based orientation estimation suffers from cumulative error and the ToF measurement provides only distance information. One way to improve the proposed sound-based localization method is to combine it with visual sensors for estimating curving pipelines. The pipeline shape, such as how the pipeline curves, can be observed using visual odometry or visual SLAM [13]. Although such a visual-based method also suffers from cumulative error, this could be overcome by integrating sound, IMU, and visual sensors in a unified state-space model as a SLAM framework.
To apply our developed snake robot platforms to pipe inspection tasks, we developed
a teleoperation and information visualization interface that displayed the snake robot
shapes and the contact forces applied by the pipes, stabilized the video feeds from
the robot’s head camera, built a pipe map based on the trajectory of a snake robot,
took photographs of the interiors of pipes for image mapping onto the pipe map, and
created unrolled pictures showing the interior pipe wall.
The developed interface is shown in Fig. 6.52, and includes the following com-
ponents:
A Display of the snake robot’s target shape
B Display of the snake robot’s current shape and contact forces
C Display of the difference between the snake robot’s target and current shapes
D Display of the head camera image
E Display of the stabilized head camera image
F Display of the pipe map
G Display of the unrolled picture of the interior pipe wall
Each of these components is described below.
In terms of displays A and B, we defined a vehicle-like body frame to ensure that
the visualized snake motions are easily understood by the operator when the robot moves
with a helical rolling motion or a crawler gait [47]. These motions generally rotate
the links over the whole body, which makes the display difficult for the operator
to understand when the reference frame is fixed to a specific link. Therefore, we
introduced the Virtual Chassis technique [42] to calculate the snake reference body
Fig. 6.52 Snake robot teleoperation user interface for pipe inspection tasks
frame and provide a more user-friendly viewpoint from which to view the snake robot
motion, no matter what kinds of shapes or motions are involved. First, the system
calculates the center of mass of the whole body by averaging the center of mass
positions of all links of the snake robot. Second, the system uses each link’s center
of mass to calculate principal axes of inertia for the whole snake robot body and
estimate the direction in which the snake robot is extended. Third, the system defines
the reference frame for the snake robot body using the principal axes of inertia such
that its x- and y-axes (parallel to the ground) are the first and second components,
and the z-axis is the cross product of the x- and y-axes in a right-handed coordinate
system.
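The three-step Virtual Chassis computation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `virtual_chassis_frame` and the use of NumPy's SVD to extract the principal axes are my own choices.

```python
import numpy as np

def virtual_chassis_frame(link_coms):
    """Body-fixed reference frame for a snake robot from the centers of
    mass of its links (N x 3 array), following the Virtual Chassis idea:
    origin at the whole-body CoM, axes along the principal axes of the
    link distribution."""
    P = np.asarray(link_coms, dtype=float)
    com = P.mean(axis=0)                 # step 1: whole-body center of mass
    centered = P - com
    # Step 2: principal axes of the centered link positions via SVD;
    # the rows of Vt are ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    x_axis, y_axis = Vt[0], Vt[1]        # first and second principal components
    # Step 3: right-handed frame: z is the cross product of x and y.
    z_axis = np.cross(x_axis, y_axis)
    R = np.column_stack([x_axis, y_axis, z_axis])  # body frame axes as columns
    return com, R
```

The returned rotation is orthonormal by construction, so it gives a stable viewpoint regardless of the gait shape.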
To make it easier for the operator to recognize how much the robot’s current shape
differs from its target shape, display C overlays the robot CG models of the two
shapes on each other. The target shape is displayed as a transparent model and the
current shape is shown as a nontransparent model to enable them to be distinguished
easily. Using this visualization method, an operator can determine that there is no
difference between the two shapes if the two robot models approximately match
each other, and can also see that there is a difference if the two robot models differ
obviously from one another. A homogeneous transformation matrix is then calculated
to fit the target and current shapes using the least-squares method, minimizing the
differences between the corresponding link centers of mass, which are paired according
to their link numbers.
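Such a least-squares fit over paired link centers of mass can be computed with the standard Kabsch/Umeyama procedure. The sketch below is an assumed implementation for illustration, not the authors' code:

```python
import numpy as np

def fit_transform(target, current):
    """Least-squares rigid transform (R, t) mapping the current link CoM
    positions onto the paired target positions (Kabsch algorithm)."""
    A = np.asarray(current, dtype=float)
    B = np.asarray(target, dtype=float)
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)                 # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against a reflection
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T                        # optimal rotation
    t = cb - R @ ca                           # optimal translation
    return R, t
```

Applying `(R, t)` to the current-shape model aligns it with the target-shape model before the two are overlaid on display C.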
In addition, the system displays spherical markers with colors that change accord-
ing to the differences between the target and current joint angles to enable recognition
of joint angle differences, which are related to the shape differences. The spherical
marker colors are set as a gradation from green to red for differences ranging from
0° to 15°, and are set to red when the difference exceeds 15°.
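A minimal version of this color mapping might look like the following; the function name and the (r, g, b) convention are illustrative assumptions:

```python
def marker_color(diff_deg, saturate_deg=15.0):
    """RGB color for a joint-angle difference marker: green at 0 degrees,
    grading linearly to red at `saturate_deg`, clamped to red beyond it."""
    u = min(max(diff_deg / saturate_deg, 0.0), 1.0)
    return (u, 1.0 - u, 0.0)  # (r, g, b), each in [0, 1]
```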
Display B shows the contact forces that are applied by the pipe to the snake robot
as vector arrows based on information from the CoP sensor (described in Sect. 6.5)
that is installed on every second snake robot link. This display allows the operator
to recognize whether the robot is applying appropriate forces to the pipe to prevent
the robot from slipping or falling, whether the robot is moving into a bent part of the
pipe, and whether the forces applied by the robot to the environment (or the forces
being applied by the environment to the robot) are too high.
The locations of the contact force arrows on the snake robot CG model are cal-
culated using the CoP sensor sheet coordinate values. The force arrow length is set
within a specific range between minimum and maximum values, because arrows that
are too short are difficult for the operator to see, and arrows that are too long will be
off the screen and thus cannot be seen by the operator. The drawn arrow colors are
set to green when the contact force is at the minimum, yellow for a medium contact
force, and red when the contact force is at the maximum.
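The arrow-length clamping and green-yellow-red coloring described above can be sketched as follows; the parameter names and the piecewise-linear gradient are assumptions for illustration:

```python
def arrow_length(force, f_min, f_max, len_min, len_max):
    """Map a contact force magnitude to an on-screen arrow length,
    clamped between len_min and len_max so the arrow is always visible."""
    u = (force - f_min) / (f_max - f_min)
    u = min(max(u, 0.0), 1.0)
    return len_min + u * (len_max - len_min)

def arrow_color(force, f_min, f_max):
    """Green at the minimum force, yellow at the midpoint, red at the
    maximum: a simple green -> yellow -> red gradient."""
    u = (force - f_min) / (f_max - f_min)
    u = min(max(u, 0.0), 1.0)
    if u < 0.5:
        return (2.0 * u, 1.0, 0.0)        # green -> yellow
    return (1.0, 2.0 * (1.0 - u), 0.0)    # yellow -> red
```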
Display E shows stabilized versions of the raw head camera images shown on
display D, based on the gravity direction obtained from the IMU mounted on
the snake robot's head. This function prevents the head camera images from
rotating with the robot's helical rolling motion.
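A minimal sketch of the stabilization idea, assuming the IMU supplies the gravity vector projected onto the image plane (x right, y down); computing the counter-rotation is the essential step, and the actual image warp would then be applied with this rotation:

```python
import numpy as np

def roll_from_gravity(gravity_xy):
    """Roll angle (degrees) of the camera about its optical axis, from the
    IMU gravity vector projected onto the image plane (x right, y down).
    Zero when gravity points straight down in the image."""
    gx, gy = gravity_xy
    return np.degrees(np.arctan2(gx, gy))

def stabilize_rotation(gravity_xy):
    """2x2 rotation matrix that counter-rotates image coordinates by the
    measured roll, keeping the displayed image upright while the snake
    performs helical rolling."""
    a = -np.radians(roll_from_gravity(gravity_xy))
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s], [s, c]])
```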
Display F shows a pipe map that is based on the snake robot’s trajectory within the
pipe. The trajectory, which represents a series of snake robot locations, is estimated
using the sound-based localization and mapping technique (which is described in
Sect. 6.6). The pipe map visualization is displayed in the form of cylinders using
a set of snake robot location points. The nontransparent light blue cylinder parts
indicate the fixed pipe map that was estimated using the sound-based localization and
mapping process, and the transparent light blue cylinder parts indicate an uncertain
pipe map that was calculated using the data from the IMU mounted on the snake
robot head and the current shape. The bend positions on the transparent light blue
cylinder parts were calculated using the IMU data and the current shape. The snake
robot CG model is also displayed on the pipe map at the robot’s current location.
The operator can take photographs using a joystick input and these photographs
will then appear at the positions on the pipe map at which they were taken. This makes
it easier for the operator to understand the interior layout of the pipe at certain points.
The photographs are located on the display without collisions to ensure that they are
not overlaid on each other. Collision avoidance among the photographs displayed on
the screen is achieved by calculation of a translation vector using the repulsive forces
from all the other photographs and an attractive force from the position at which the
individual photograph was taken, and this vector is applied in every drawing frame.
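One drawing-frame update of this force-directed layout might be implemented as below. The gain constants and the inverse-square repulsion law are illustrative assumptions, since the text specifies only that each photo receives repulsive forces from the other photos and an attractive force toward its capture position:

```python
import numpy as np

def layout_step(positions, anchors, k_rep=1.0, k_att=0.1, min_dist=1e-6):
    """One drawing-frame update of the photograph layout: each photo is
    pushed away from every other photo (repulsion) and pulled toward the
    position where it was taken (attraction); the net force is applied
    as this frame's translation vector."""
    P = np.asarray(positions, dtype=float)
    A = np.asarray(anchors, dtype=float)
    new = P.copy()
    for i in range(len(P)):
        force = k_att * (A[i] - P[i])          # spring back to capture point
        for j in range(len(P)):
            if i == j:
                continue
            d = P[i] - P[j]
            r = max(np.linalg.norm(d), min_dist)
            force += k_rep * d / r**3          # repel overlapping photos
        new[i] = P[i] + force
    return new
```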
Display G shows the unrolled picture of the interior pipe wall that was formed
using images taken by the snake robot head camera. First, the line segments for
the straight parts are calculated using a pipe map, and point clouds are created that
consist of pipe cylinder shapes with their radii along the line segments. These point
clouds are then transformed into the coordinate frame of the head camera images as
viewed from the head camera positions at which the images were taken. The system
creates two-dimensional point clouds in the image coordinates that correspond to
the three-dimensional point clouds in the inertial frame of reference. The system
then creates meshes on the camera images, which are transformed into meshes of
unrolled pictures, and the whole unrolled pictures are created to correspond to the
point clouds of the pipe. Each image is processed to ensure that its brightness variance
is minimized because the camera image shows large brightness variations between
locations in the light and those in the dark inside the actual pipe, which would make
the stitched images look unnatural. The unrolled picture that is created is scrolled
from left to right on the display to stitch the two ends of the unrolled picture, and a
red indicator line is displayed on the pipe map to show the correspondence between
the center line of the unrolled picture and the pipe shape.
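The first two steps, generating a cylinder point cloud along a straight pipe segment and projecting it into a head camera image, can be sketched as follows; the pinhole camera model and the function names are assumptions for illustration:

```python
import numpy as np

def cylinder_points(p0, p1, radius, n_axial=20, n_ring=16):
    """Point cloud on a cylinder of the pipe's radius along the straight
    pipe-map segment p0 -> p1."""
    p0, p1 = np.asarray(p0, dtype=float), np.asarray(p1, dtype=float)
    axis = (p1 - p0) / np.linalg.norm(p1 - p0)
    # Any vector not parallel to the axis yields a perpendicular basis (u, v).
    tmp = np.array([1.0, 0.0, 0.0]) if abs(axis[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(axis, tmp); u /= np.linalg.norm(u)
    v = np.cross(axis, u)
    pts = []
    for s in np.linspace(0.0, 1.0, n_axial):
        c = p0 + s * (p1 - p0)
        for th in np.linspace(0.0, 2 * np.pi, n_ring, endpoint=False):
            pts.append(c + radius * (np.cos(th) * u + np.sin(th) * v))
    return np.array(pts)

def project(points_world, R, t, fx, fy, cx, cy):
    """Pinhole projection of world points into image coordinates, given the
    camera pose (R: camera-to-world rotation, t: camera position in world)."""
    Pc = (np.asarray(points_world, dtype=float) - t) @ R  # world -> camera frame
    u = fx * Pc[:, 0] / Pc[:, 2] + cx
    v = fy * Pc[:, 1] / Pc[:, 2] + cy
    return np.stack([u, v], axis=1)
```

The projected points give, for each camera image, the mesh correspondence used to warp that image into the unrolled picture.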
The developed human interface was tested in the ImPACT-TRC evaluation field.
First, displays A – C of the developed interface were used to achieve the following
tasks:
• Helical rolling motion and passing over a flange on a pipe (see Fig. 6.15)
• Moving over rough terrain using the crawler gait (see Fig. 6.17)
Second, the complete displays of the developed interface were used to achieve
the following tasks:
• Helical rolling motion in a pipe including bend sections (see Fig. 6.12)
• Inspection of the interior condition of the pipe
Through this test procedure, the operator was able to read instrument meters placed
on the outside of the pipe using only the raw/stabilized head camera images obtained
when the snake robot reached the top of the pipe, as shown in Fig. 6.52. The operator
was also able to confirm that the pipe map was reasonably well constructed, and to
check the interior of the pipe using the photographs or the unrolled pictures.
A non-wheeled snake robot and a wheeled snake robot have been developed by the
Tough Snake Robot Systems Group. A jamming layered membrane gripper mechanism is
mounted at the head of each robot to allow it to grasp various types of objects.
A whole-body tactile sensor has been developed to measure the contact forces on
the robot. A sound-based online localization method for in-pipe snake robots has
also been developed. By integrating these platforms and fundamental technologies,
a human-robot interface has also been constructed.
In the middle of July 2018, flood and landslip disasters were caused by torrential
rain in Western Japan. We dispatched the developed snake robots to gather informa-
tion from houses destroyed by a landslip at Handa-Yama Mountain in Okayama City
on July 25 and 26, 2018.
We hope that our developed robots will contribute to disaster responses, even if
only a little. We dedicate this work to all victims of disasters.
Acknowledgements This work was supported by the Impulsing Paradigm Change through Disruptive
Technologies (ImPACT) Tough Robotics Challenge program of the Japan Science and Technology
Agency (JST).
References
1. Ariizumi, R., Matsuno, F.: Dynamic analysis of three snake robot gaits. IEEE Trans. Robot.
33(5), 1075–1087 (2017)
2. Baba, T., Kameyama, Y., Kamegawa, T., Gofuku, A.: A snake robot propelling inside of a pipe
with helical rolling motion. In: SICE Annual Conference, pp. 2319–2325 (2010)
3. Bancon, G., Huber, B.: Depression and grippers with their possible applications. In: 12th
International Symposium on Industrial Robots (ISIR), pp. 321–329 (1982)
4. Bando, Y., Itoyama, K., Konyo, M., Tadokoro, S., Nakadai, K., Yoshii, K., Okuno, H.G.:
Microphone-accelerometer based 3D posture estimation for a hose-shaped rescue robot. In:
2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5580–
5586 (2015). https://doi.org/10.1109/IROS.2015.7354168
5. Bando, Y., Suhara, H., Tanaka, M., Kamegawa, T., Itoyama, K., Yoshii, K., Matsuno, F., Okuno,
H.G.: Sound-based online localization for an in-pipe snake robot. In: 2016 IEEE International
Symposium on Safety, Security, and Rescue Robotics (SSRR2016), pp. 207–213 (2016). https://
doi.org/10.1109/SSRR.2016.7784300
6. Bark, C., Binnenbose, T., Vogele, G., Weisener, T., Widmann, M.: In: MEMS 98. IEEE
Eleventh Annual International Workshop on Micro Electro Mechanical Systems. An Investi-
gation of Micro Structures, Sensors, Actuators, Machines and Systems, pp. 301–305 (1998)
7. Biagiotti, L., Lotti, F., Melchiorri, C., Vassura, G.: Mechatronic design of innovative fingers
for anthropomorphic robot hands. In: 2003 IEEE International Conference on Robotics and
Automation (ICRA), pp. 3187–3192 (2003)
8. Dijkstra, E.W.: A note on two problems in connexion with graphs. Numer. Math. 1(1), 269–271
(1959). https://doi.org/10.1007/BF01386390
9. Dollar, A.M., Howe, R.D.: A robust compliant grasper via shape deposition manufacturing.
IEEE/ASME Trans. Mech. 11(2), 154–161 (2006)
10. Dollar, A.M., Howe, R.D.: Simple, robust autonomous grasping in unstructured environments.
In: 2007 IEEE International Conference on Robotics and Automation (ICRA), pp. 4693–4700
(2007)
11. Fujita, T., Shimada, K.: Characteristics and applications of magnetorheological fluids. J.-Magn.
Soc. Jpn. 27(3), 91–100 (2003)
12. Fujita, M., Tadakuma, K., Komatsu, H., Takane, E., Nomura, A., Ichimura, T., Konyo, M.,
Tadokoro, S.: Jamming layered membrane gripper mechanism for grasping differently shaped-
objects without excessive pushing force for search and rescue. Adv. Robot. 32(11), 590–604
(2018). https://doi.org/10.1080/01691864.2018.1451368
13. Hansen, P., Alismail, H., Rander, P., Browning, B.: Pipe mapping with monocular fisheye
imagery. In: 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems
(IROS), pp. 5180–5185 (2013). https://doi.org/10.1109/IROS.2013.6697105
14. Hasegawa, H., Suzuki, Y., Ming, A., Koyama, K., Ishikawa, M., Shimojo, M.: Net-structure
proximity sensor: high-speed and free-form sensor with analog computing circuit. IEEE/ASME
Trans. Mech. 20(6), 3232–3241 (2015)
15. Hayashi, M., Sagisaka, T., Ishizaka, Y., Yoshikai, T., Inaba, M.: Development of functional
whole-body flesh with distributed three-axis force sensors to enable close interaction by
humanoids. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS),
pp. 3610–3615 (2007). https://doi.org/10.1109/IROS.2007.4399360
16. Hirose, S.: Biologically Inspired Robots: Snake-Like Locomotor and Manipulator. Oxford
University Press, Oxford, UK (1987)
17. Hirose, S., Umetani, Y.: Development of soft gripper for the versatile robot hand. Mech. Mach.
Theory 13, 351–359 (1978)
18. Ho, C., Muammer, K.: Design and feasibility tests of a flexible gripper based on inflatable
rubber pockets. Int. J. Mach. Tools Manuf. 46(12), 1350–1361 (2006)
19. Ilievski, F., Mazzeo, A.D., Shepherd, R.F., Chen, X., Whitesides, G.M.: Soft robotics for
chemists. Angew. Chem. 123(8), 1930–1935 (2011)
20. ISO: Safety of machinery - Permanent means of access to machinery - Part 3: stairs, stepladders
and guard-rails. ISO 14122-3 (2001)
21. Jacoff, A.: Standard test methods for response robots. ASTM International Standards Commit-
tee on Homeland Security Applications; Operational Equipment; Robots (E54.08.01) (2016)
22. Kamegawa, T., Yamasaki, T., Igarashi, H., Matsuno, F.: Development of the snake-like rescue
robot KOHGA. In: 2004 IEEE International Conference on Robotics and Automation (ICRA),
pp. 5081–5086 (2004)
23. Kamegawa, T., Harada, T., Gofuku, A.: Realization of cylinder climbing locomotion with
helical form by a snake robot with passive wheels, In: 2009 IEEE International Conference on
Robotics and Automation (ICRA), pp. 3067–3072 (2009)
24. Kawasaki, H., Komatsu, T., Uchiyama, K.: Dexterous anthropomorphic robot hand with dis-
tributed tactile sensor: Gifu hand II. IEEE/ASME Trans. Mech. 7(3), 296–303 (2002). https://
doi.org/10.1109/TMECH.2002.802720
25. Kim, D.Y., Kim, J., Kim, I., Jun, S.: Artificial landmark for vision-based SLAM of water pipe
rehabilitation robot. In: 2015 12th International Conference on Ubiquitous Robots and Ambient
Intelligence (URAI2015), pp. 444–446 (2015). https://doi.org/10.1109/URAI.2015.7358900
26. Knapp, C., Carter, G.: The generalized correlation method for estimation of time delay. IEEE
Trans. Acoust. Speech Signal Process. 24(4), 320–327 (1976). https://doi.org/10.1109/TASSP.
1976.1162830
27. Kon, K., Tanaka, M., Tanaka, K.: Mixed integer programming based semi-autonomous step
climbing of a snake robot considering sensing strategy. IEEE Trans. Control Syst. Tech. 24(1),
252–264 (2016). https://doi.org/10.1109/TCST.2015.2429615
28. Kouno, K., Yamada, H., Hirose, S.: Development of active-joint active-wheel high traversability
snake-like robot ACM-R4.2. J. Robot. Mech. 25(3), 559–566 (2013)
29. Krys, D., Najjaran, H.: Development of visual simultaneous localization and mapping
(VSLAM) for a pipe inspection robot. In: 2007 International Symposium on Computational
Intelligence in Robotics and Automation (CIRA2007), pp. 344–349 (2007). https://doi.org/10.
1109/CIRA.2007.382850
30. Liljeback, P., Pettersen, K.Y., Stavdahl, O., Gravdahl, J.T.: Snake Robots. Springer, Berlin
(2013)
31. Lim, H., Choi, J.Y., Kwon, Y.S., Jung, E.-J., Yi, B.-J.: SLAM in indoor pipelines with 15mm
diameter. In: 2008 IEEE International Conference on Robotics and Automation (ICRA2008),
pp. 4005–4011 (2008). https://doi.org/10.1109/ROBOT.2008.4543826
32. Maruyama, R., Watanabe, T., Uchida, M.: Delicate grasping by robotic gripper with incom-
pressible fluid-based deformable fingertips. In: 2013 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS), pp. 5469–5474 (2013)
33. Monkman, G.J., Hesse, S., Steinmann, R., Schunk, H.: Robot Grippers. Wiley, New York
(2007)
34. Murphy, R.R.: Disaster Robotics. MIT Press, Cambridge (2014)
35. Murtra, A.C., Tur, J.M.M.: IMU and cable encoder data fusion for in-pipe mobile robot
localization. In: 2013 IEEE Conference on Technologies for Practical Robot Applications
(TePRA2013), pp. 1–6 (2013). https://doi.org/10.1109/TePRA.2013.6556377
36. Núñez, C.G., Navaraj, W.T., Polat, E.O., Dahiya, R.: Energy-autonomous, flexible, and trans-
parent tactile skin. Adv. Funct. Mater. 27(18) (2017). https://doi.org/10.1002/adfm.201606287
37. Ohashi, T., Yamada, H., Hirose, S.: Loop forming snake-like robot ACM-R7 and its serpenoid
oval control. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, pp.
413–418 (2010). https://doi.org/10.1109/IROS.2010.5651467
38. Okatani, Y., Nishida, T., Tadakuma, K.: Development of universal robot gripper using MRα
fluid. In: 2014 Joint 7th International Conference on Soft Computing and Intelligent Systems
(SCIS) and 15th International Symposium on Advanced Intelligent Systems (ISIS), pp. 231–
235 (2014)
39. Pettersson, A., Davis, S., Gray, J.O., Dodd, T.J., Ohlsson, T.: Design of a magnetorheological
robot gripper for handling of delicate food products with varying shapes. J. Food Eng. 98(3),
332–338 (2010)
40. Qi, W., Kamegawa, T., Gofuku, A.: Helical wave propagate motion on a vertical pipe with a
branch for a snake robot. In: 2nd International Symposium on Swarm Behavior and Bio-Inspired
Robotics (SWARM2017), pp. 105–112 (2017)
41. Rollinson, D., Choset, H.: Pipe network locomotion with a snake robot. J. Field Robot. 33(3),
322–336 (2016)
42. Rollinson, D., Buchan, A., Choset, H.: Virtual chassis for snake robots: definition and appli-
cations. Adv. Robot. 26(17), 2043–2064 (2012). http://dblp.uni-trier.de/db/journals/ar/ar26.
html#RollinsonBC12
43. Romano, J.M., Hsiao, K., Niemeyer, G., Chitta, S., Kuchenbecker, K.J.: Human-inspired robotic
grasp control with tactile sensing. IEEE Trans. Robot. 27(6), 1067–1079 (2011). https://doi.
org/10.1109/TRO.2011.2162271
44. Shimojo, M., Araki, T., Teshigawara, S., Ming, A., Ishikawa, M.: A net-structure tactile sensor
covering free-form surface and ensuring high-speed response. In: 2007 IEEE/RSJ International
Conference on Intelligent Robots and Systems (IROS), pp. 670–675 (2007). https://doi.org/
10.1109/IROS.2007.4399084
45. Suzuki, Y.: Multilayered center-of-pressure sensors for robot fingertips and adaptive feed-
back control. IEEE Robot. Autom. Lett. 2(4), 2180–2187 (2017). https://doi.org/10.1109/LRA.
2017.2723469
46. Suzuki, Y., Asano, F., Kim, H.-Y., Sone, T.: An optimum computer-generated pulse signal
suitable for the measurement of very long impulse responses. J. Acoust. Soc. Am. 97(2),
1119–1123 (1995). https://doi.org/10.1121/1.412224
47. Takemori, T., Tanaka, M., Matsuno, F.: Gait design for a snake robot by connecting curve
segments and experimental demonstration. IEEE Trans. Robot. PP-99, 1–8 (2018). https://doi.
org/10.1109/TRO.2018.2830346
48. Takemori, T., Tanaka, M., Matsuno, F.: Ladder climbing with a snake robot. In: 2018 IEEE/RSJ
International Conference on Intelligent Robots and Systems (IROS) (2018)
49. Tanaka, M., Matsuno, F.: Control of snake robots with switching constraints: trajectory tracking
with moving obstacle. Adv. Robot. 28(6), 415–419 (2014). https://doi.org/10.1080/01691864.
2013.867285
50. Tanaka, M., Matsuno, F.: Modeling and control of head raising snake robots by using kinematic
redundancy. J. Intell. Robot. Syst. 75(1), 53–69 (2014). https://doi.org/10.1007/s10846-013-
9866-y
51. Tanaka, M., Tanaka, K.: Control of a snake robot for ascending and descending steps. IEEE
Trans. Robot. 31(2), 511–520 (2015). https://doi.org/10.1109/TRO.2015.2400655
52. Tanaka, M., Kon, K., Tanaka, K.: Range-sensor-based semiautonomous whole-body collision
avoidance of a snake robot. IEEE Trans. Control Syst. Tech. 23(5), 1927–1934 (2015). https://
doi.org/10.1109/TCST.2014.2382578
53. Tanaka, M., Nakajima, M., Tanaka, K.: Smooth control of an articulated mobile robot with
switching constraints. Adv. Robot. 30(1), 29–40 (2016). https://doi.org/10.1080/01691864.
2015.1102646
54. Tanaka, M., Nakajima, M., Suzuki, Y., Tanaka, K.: Development and control of articulated
mobile robot for climbing steep stairs. IEEE/ASME Trans. Mech. 23(2), 531–541 (2018).
https://doi.org/10.1109/TMECH.2018.2792013
55. Thrun, S., Burgard, W., Fox, D.: Probabilistic Robotics. MIT Press, Cambridge (2005)
56. Tomo, T.P., Wong, W.K., Schmitz, A., Kristanto, H., Sarazin, A., Jamone, L., Somlor, S.,
Sugano, S.: A modular, distributed, soft, 3-axis sensor system for robot hands. In: IEEE-RAS
16th International Conference on Humanoid Robots, pp. 454–460 (2016). https://doi.org/10.
1109/HUMANOIDS.2016.7803315
57. Transeth, A.A., Leine, R.I., Glocker, C., Pettersen, K.Y.: Snake robot obstacle-aided locomo-
tion: modeling, simulations and experiments. IEEE Trans. Robot. 24(1), 88–104 (2008)
58. Valenti, R.G., Dryanovski, I., Xiao, J.: Keeping a good attitude: a quaternion-based orientation
filter for IMUs and MARGs. Sensors 15(8), 19302–19330 (2015). https://doi.org/10.3390/
s150819302
59. Vogt, D.M., Park, Y.-L., Wood, R.J.: Design and characterization of a soft multi-axis force
sensor using embedded microfluidic channels. IEEE Sens. J. 13(10), 4056–4064 (2013). https://
doi.org/10.1109/JSEN.2013.2272320
60. Yamada, H., Hirose, S.: Study on the 3D shape of active cord mechanism. In: 2006 IEEE
International Conference on Robotics and Automation (ICRA), pp. 2890–2895 (2006). https://
doi.org/10.1109/ROBOT.2006.1642140
61. Yamada, H., Hirose, S.: Study of active cord mechanism –approximations to continuous curves
of a multi-joint body. J. Robot. Soc. Jpn. (in Japanese with English summary) 26(1), 110–120
(2008)
62. Yamada, H., Takaoka, S., Hirose, S.: A snake-like robot for real-world inspection applications
(the design and control of a practical active cord mechanism). Adv. Robot. 27(1), 47–60 (2013)
63. Yamaguchi, A., Atkeson, C.G.: Implementing tactile behaviors using FingerVision. In: IEEE-
RAS 17th International Conference on Humanoid Robots, pp. 241–248 (2017). https://doi.org/
10.1109/HUMANOIDS.2017.8246881
64. Yatim, N.M., Shauri, R.L.A., Buniyamin, N.: Automated mapping for underground pipelines:
an overview. In: 2014 2nd International Conference on Electrical, Electronics and System Engi-
neering (ICEESE2014), pp. 77–82 (2015). https://doi.org/10.1109/ICEESE.2014.7154599
65. Yim, M., Duff, G.D., Roufas, D.K.: PolyBot: a modular reconfigurable robot. In: 2000 IEEE
International Conference on Robotics and Automation (ICRA), pp. 514–520 (2000). https://
doi.org/10.1109/ROBOT.2000.844106
66. Yoshikai, T., Fukushima, H., Hayashi, M., Inaba, M.: Development of soft stretchable knit sen-
sor for humanoids’ whole-body tactile sensibility. In: IEEE-RAS 9th International Conference
on Humanoid Robots, pp. 624–631 (2009). https://doi.org/10.1109/ICHR.2009.5379556
67. Yoshikai, T., Tobayashi, K., Inaba, M.: Development of 4-axis soft deformable sensor for
humanoid sensor flesh. In: IEEE-RAS 11th International Conference on Humanoid Robots,
pp. 205–211 (2011). https://doi.org/10.1109/Humanoids.2011.6100905
68. Yoshida, K., Muto, T., Kim, J.W., Yokota, S.: An ER microactuator with built-in pump and
valve. Int. J. Autom. Technol. (IJAT) 6(4), 468–475 (2012)
69. Yu, P., Liu, W., Gu, C., Cheng, X., Fu, X.: Flexible piezoelectric tactile sensor array for dynamic
three-axis force measurement. Sensors 16(6), (2016). https://doi.org/10.3390/s16060819
70. Zhang, C., Florencio, D., Zhang, Z.: Why does PHAT work well in low noise, reverberative
environments? In: 2008 International Conference on Acoustics, Speech, and Signal Processing
(ICASSP2008), pp. 2565–2568 (2008). https://doi.org/10.1109/ICASSP.2008.4518172
71. Zhu, T., Yang, H., Zhang, W.: A Spherical Self-Adaptive Gripper with shrinking of an elas-
tic membrane. In: 2016 International Conference on Advanced Robotics and Mechatronics
(ICARM), pp. 512–517 (2016)
Chapter 7
WAREC-1 – A Four-Limbed Robot with
Advanced Locomotion and Manipulation
Capabilities
Abstract This chapter introduces a novel four-limbed robot, WAREC-1, that has
advanced locomotion and manipulation capability with versatile locomotion styles.
At disaster sites, there are various types of environments through which a robot must
K. Hashimoto (B)
Meiji University, 1-1-1 Higashi-Mita, Tama-ku, Kawasaki-shi, Kanagawa 214-8571, Japan
e-mail: hashimoto@meiji.ac.jp
T. Matsuzawa · X. Sun
Waseda University, 17 Kikui-cho, Shinjuku-ku, Tokyo 162-0044, Japan
e-mail: contact@takanishi.mech.waseda.ac.jp
X. Sun
e-mail: contact@takanishi.mech.waseda.ac.jp
T. Fujiwara · X. Wang · Y. Konishi · T. Endo · F. Matsuno
Kyoto University, Kyodaikatsura, Nishikyo-ku, Kyoto 615-8540, Japan
e-mail: fujiwara.tomofumi.6w@kyoto-u.ac.jp
X. Wang
e-mail: ojijin93@gmail.com
Y. Konishi
e-mail: konishi.yasuaki.68w@st.kyoto-u.ac.jp
T. Endo
e-mail: endo@me.kyoto-u.ac.jp
F. Matsuno
e-mail: matsuno.fumitoshi.8n@kyoto-u.ac.jp
N. Sato
Nagoya Institute of Technology, Gokiso-cho, Syowa-ku, Nagoya Aichi 466-8555, Japan
e-mail: sato.noritaka@nitech.ac.jp
N. Kubota · N. Takesue · K. Wada
Tokyo Metropolitan University, 6-6 Asahigaoka, Hino, Tokyo 191-0065, Japan
e-mail: kubota@tmu.ac.jp
N. Takesue
e-mail: ntakesue@tmu.ac.jp
© Springer Nature Switzerland AG 2019 327
S. Tadokoro (ed.), Disaster Robotics, Springer Tracts in Advanced Robotics 128,
https://doi.org/10.1007/978-3-030-05321-5_7
328 K. Hashimoto et al.
traverse, such as rough terrain covered with rubble, narrow places, stairs, and vertical
ladders. WAREC-1 moves in hazardous environments by transitioning among var-
ious locomotion styles, such as bipedal/quadrupedal walking, crawling, and ladder
climbing. WAREC-1 has identically structured limbs with 28 degrees of freedom
(DoF) in total, 7 DoF in each limb. The robot is 1,690 mm tall when standing
on two limbs, and weighs 155 kg. We developed three types of actuator units with
hollow structures to pass the wiring inside the joints of WAREC-1, which enables
the robot to move on rubble piles by creeping on its stomach. The main contributions
of our research are the following five topics: (1) development of a four-limbed robot,
WAREC-1; (2) simultaneous localization and mapping (SLAM) using a laser range
sensor array; (3) a teleoperation system using past image records to generate a third-
person view; (4) a high-power and low-energy hand; and (5) a lightweight master system
for telemanipulation and an assist control system for improving the maneuverability
of master-slave systems.
Disasters such as earthquakes, storms, and floods occur all over the world. Recovery
work and field surveys are required at disaster sites. However, in some situations,
disaster locations are not easily accessible to humans. Therefore, disaster response
robots are necessary to conduct tasks in hazardous environments.
K. Wada
e-mail: k_wada@tmu.ac.jp
Y. Toda
Okayama University, 3-1-1 Tsuhima Naka, Kita, Okayama 700-8530, Japan
e-mail: ytoda@okayama-u.ac.jp
T. Mouri · H. Kawasaki
Gifu University, 1-1 Yanagido, Gifu 501-1193, Japan
e-mail: mouri@gifu-u.ac.jp
H. Kawasaki
e-mail: h_kawasa@gifu-u.ac.jp
A. Namiki · Y. Liu
Chiba University, 1-33 Yayoi-cho, Inage-ku, Chiba 263-8522, Japan
e-mail: namiki@faculty.chiba-u.jp
Y. Liu
e-mail: aeaa5023@chiba-u.jp
A. Takanishi
Waseda University, 2-2, Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
e-mail: contact@takanishi.mech.waseda.ac.jp
S. Tadokoro
Tohoku University, 6-6-01 Aramaki-Aza-Aoba, Aoba-ku, Sendai 980-8579, Japan
e-mail: tadokoro@rm.is.tohoku.ac.jp
7 WAREC-1 – A Four-Limbed Robot with Advanced Locomotion … 329
Flying robots can reach almost anywhere without making physical contact with the
environment. However, most flying robots still have difficulty with heavy-load
manipulation, even though there are some studies on load carriage and powerful
manipulation for such robots [26, 43]. Crawler robots such as PackBot [61] and Quince [63] are
often used at disaster sites. However, it is difficult for crawler robots to climb spiral
stairs and vertical ladders. The DARPA Robotics Challenge (DRC) was held from
2012 to 2015 with an aim to develop semi-autonomous ground robots that can per-
form complex tasks in dangerous, degraded, human-engineered environments [5].
Several types of robots entered the DRC. Most robots were legged robots or leg-
wheeled robots, such as DRC-Hubo+ [28], Atlas [2], CHIMP [55], Momaro [50],
RoboSimian [20], JAXON [25], and WALK-MAN [59]. However, most of them
cannot climb a vertical ladder because the DRC did not include a vertical ladder
climbing task. Disaster response robots capable of ladder climbing have rarely been
put to use in disaster areas, even though vertical ladders are common in ordinary
buildings and infrastructure, in places that lack the conditions for installing
powered vertical conveyance or the space for stairs.
Ladder climbing by robots has been studied for decades. In 1989, LCR-1, a ladder-
climbing robot with grippers at the ends of its four limbs, was developed [16]. Since
then, humanoid robots such as Gorilla-III [12], HRP-2 [60], and E2-DR [64], as well as
multi-legged robots such as ASTERISK [11], have been developed. These robots can climb
a vertical ladder, and some of them can also transition between bipedal and quadrupedal
locomotion depending on the ground surface conditions. However, it is still difficult
for such legged robots to traverse rough terrain covered with rubble.
The long-term goal of our research is to develop a legged robot that has both
advanced locomotion and manipulation capabilities in extreme environments, as
shown in Fig. 7.1. The legged robot team in the “ImPACT Tough Robotics Challenge
(TRC)” consists of 10 universities and research institutes, and 11 research topics
are conducted (see Fig. 7.2). To improve manipulation capability, we work with the
robot hand team to develop an end-effector for a legged robot. We work with the SLAM,
image processing, and sound processing teams to recognize the robot's environment.
Aiming to realize a legged robot with high mobility and manipulation capability, we
also cooperate with the remote control teams. We also study hydraulic actuators to
increase the output and impact resistance of a legged robot.
So far, we have integrated simultaneous localization and mapping (SLAM), a
teleoperation system using past image records to generate a third-person view, a
high-power, low-energy hand, and a lightweight master system for telemanipulation.
As a result, approaching a valve and turning it against an opening torque of 90 Nm
were achieved remotely. Figure 7.3 shows the appearance of the robot with these
technologies integrated. This chapter describes the details of these integration
technologies.
This section presents WAREC-1 (WAseda REsCuer - No. 1), a four-limbed robot
that has advanced locomotion capabilities with versatile locomotion styles, including
bipedal walking, quadrupedal walking, vertical ladder climbing, and crawling on its
stomach. Section 7.1.1 describes the categorization of extreme environments and
330 K. Hashimoto et al.
Fig. 7.1 Long-term goal of a four-limbed robot having versatile locomotion styles such as
bipedal/quadrupedal walking, crawling, and ladder climbing
Table 7.1 Requirements for a disaster response robot working in extreme environments

Function classification | Item | Anticipated difficulty
Locomotion | Rough terrain | Various terrain shapes and profiles
Locomotion | Ladder | Safety cage
Locomotion | Narrow space | Limited working space/poor visibility
Manipulation | Opening/closing valves | Various mounting positions, shapes, and required torques
Manipulation | Operating switches | Various mounting positions and shapes
Manipulation | Using tools for humans | Various shapes and weights
Sensing | Obtaining external information | Vision recognition in low-visibility environment/sound source localization
Robustness | Recovery from falling | Robust hardware/robust control
Fig. 7.4 Classification of extreme environments and a four-limbed robot having versatility in
locomotion styles
robot, and has arms as powerful as legs, the robot can lift itself up with two arms and
climb over a large obstacle.
“A low ceiling place” and “a narrow corridor” are examples of an environment
with high narrowness and small inclination. A four-limbed robot can pass through a
narrow space with two limbs, e.g., by balance beam walking and crab walking. The
robot can also pass through a low ceiling place by crawling motion, making its body
contact the ground.
“Stairs” are an example of an environment with large unevenness and medium
inclination. A four-limbed robot can climb up and down stairs by quadrupedal
locomotion. “Spiral stairs” are a narrow variant of “stairs,” for which bipedal
locomotion is suitable because the turning radius of bipedal locomotion is smaller
than that of quadrupedal locomotion.
“A stone wall” and “a vertical ladder” are examples of a highly inclined
environment. “A vertical ladder with a safety cage” is positioned in the environment
with both high inclination and high narrowness.
A legged robot can adapt to all the environments shown in Fig. 7.4. In particular,
we consider that a four-limbed robot is effective for such environments.
In order to realize a robot that can move in extreme environments, we propose a four-
limbed robot capable of various locomotion styles: not only a bipedal/quadrupedal
walking mode but also a crawling mode and a ladder climbing mode. Figure 7.6
illustrates the overview and DoF configuration of WAREC-1. The robot is 1,690 mm
tall when standing on two limbs, and weighs 155 kg. The specifications of each
joint, such as rated torque, rated speed, motor power, reduction ratio, and movable
angles, are presented in Table 7.2. Regarding the movable angle, the reference point
is the state where the limb corresponding to the left leg is extended in the direction
of gravity as depicted in Fig. 7.6. The requirements of each joint were determined
through dynamic simulation of bipedal walking, ladder climbing, and crawling.
Fig. 7.6 a Overview and b DoF configuration of WAREC-1
Table 7.5 Dimensions for rectangular access openings for body passage

Dimensions | Depth (Light clothing) | Depth (Bulky clothing) | Width (Light clothing) | Width (Bulky clothing)
Top and bottom access (mm) | 330 | 410 | 580 | 690
Side access (mm) | 660 | 740 | 760 | 860
determined the body dimensions of the robot so that it can pass through such a narrow
space. We will target narrower spaces with light clothing in the future.
(b) Degree of Freedom and Movable Angle of Joints
For bipedal walking and manipulation tasks, each limb should have at least 6 degrees
of freedom (DoF). We decided to provide 7-DoFs to each limb to have redundancy.
Regarding the number of limbs, there are several options, such as four limbs, six
limbs, eight limbs, etc., but we selected four limbs to reduce the robot weight. All
four limbs share the same structure, which enables them to be used as both arms and
legs. Furthermore, robustness against mechanical failure increases because the robot
can continue to move with two limbs if one or two limbs are broken. Figure 7.6b
illustrates the DoF configuration of WAREC-1. There are 3-DoFs at the hip/shoulder
joint, 1-DoF at the knee/elbow joint, and 3-DoFs at the ankle/wrist joint. Movable
angles of each joint are designed to be as large as possible to expand the robot's workspace.
Since contact between WAREC-1 and the environment is expected in its available loco-
motion styles, efforts were made to reduce the total amount of wiring required and
to avoid exposing the wiring to the external environment. Moreover, wiring often limits
the movable angle of each joint of the robot. Therefore, we decided to pass the wiring
inside the joints of the robot. We developed three types of actuator units (high output,
medium output, and small output) with a hollow structure, based on the requirements
for each joint of WAREC-1 (see Table 7.6). For the actuator unit of medium output,
two reduction ratios of 160 and 100 were prepared.
(a) Drive Joints
Figures 7.7 and 7.8 depict a cross-sectional view and an exploded view of the designed
actuator unit (high output), respectively. We adopted TQ-systems’ frameless motor
(RoboDrive), which has a large output torque with respect to the mass. As a speed
reducer, we selected the CSD series of Harmonic Drive Systems Inc. Torque gen-
erated between the rotor (14) of the frameless motor and the stator (15) is input to
the wave generator (09) of the harmonic drive via the output-side motor shaft (13).
Then, it is decelerated by the harmonic drive and is output to the output flange (04).
As an encoder for detecting the rotation angle of the rotor (14), a hollow-shaft type
magnetic incremental encoder (resolution: 14,400 cpr) (21, 24) from Renishaw plc.
is mounted. A magnetic absolute encoder (resolution: 19 bits) (25, 27) is mounted in
order to detect the angle of the output shaft after deceleration.
The wirings passing through the hollow shaft of actuator units are power sup-
ply lines for actuators and motor drivers, a CAN communication line, and a serial
communication line of a 6-axis force/torque sensor. These lines must be connected
to the computer system inside the body. The wiring runs through each limb as depicted
in Fig. 7.9. The hollow diameter of the joint was determined from the thickness of
the wiring passing through the inside of the actuator units.
7 WAREC-1 – A Four-Limbed Robot with Advanced Locomotion … 337
Because the wiring goes out of the joint at the connection between each actuator unit, we installed a wiring cover
there. O-rings (06, 17, 20) and oil seals (16) are used to prevent the entry of dust
from outside and grease leakage from the reducer. Furthermore, we chose a sealed
bearing for the deep groove ball bearing (12). The framework of the four-limbed
robot is mainly fabricated from aluminum alloy A7075 in order to realize both low
weight and high stiffness.
(a) Type A: Hook of #3 on the palm side (b) Type B: Hook of #3 on the back of the hand
between rungs and the shank of the robot when climbing a vertical ladder when
compared to groove #2 because the distance from the ankle joint to the groove #1 is
longer than that between the ankle joint and the groove #2. The groove #2, however,
is useful in reducing the torque of the ankle joint because the moment arm length
of the groove #2 is shorter than that of the groove #1. Whether to use the groove
#1 or #2 depends on the situation. Regarding hook #3, the Type A end-effector
requires a larger movable angle of the wrist pitch joint, more than
90◦, as shown in Fig. 7.13a. Therefore, we chose the Type B end-effector, which
requires a smaller movable angle of the wrist pitch joint when climbing a ladder (see
Fig. 7.13b). The sizes of the grooves and hook are designed to enable the robot to
hang on rungs and side rails with diameters of 19–38 mm, as stipulated by the
JIS and MIL standards.
We adopted a distributed control system and a small servo driver, which can be
used in a distributed configuration. Figure 7.14 depicts the electrical system of
WAREC-1. A single-board computer (96 × 90 mm) conforming to the PC/104-Plus standard,
with an Intel® Atom™ processor E3845 (1.91 GHz), was selected as the CPU board for
controlling whole-body motion and is mounted inside the body. Communication
(a) Type A: Hook of #3 on the palm side (b) Type B: Hook of #3 on the back of the hand
Fig. 7.13 Comparison of required movable angle for the pitch joint of the wrist
between the CPU board and servo drivers is realized using an internal network based
on CAN (Controller Area Network). We use three CAN interface boards, each of
which has two CAN ports. One CAN port controls four to six motors.
As for a servo driver, we selected Elmo Motion Control’s ultra-small driver, Gold
Twitter (G-TWI 25/100 SE, dimension: 35 × 30 × 11.5 mm, mass: 18.6 g). The rated
input voltage is 10 to 95 V, the maximum continuous power output is 2,015 W, and
the continuous DC current is 25 A. Since this servo driver is ultra-small, it has poor
maintainability: it has no connectors, just pins. Therefore, we designed and
developed a relay board (55 × 45 × 15 mm, Fig. 7.15) having connectors for each
purpose such as encoders, CAN communication, etc. The relay board can be stacked
on the servo driver, and each servo driver is mounted near each actuator unit.
So far, motion of the robot is generated by an external computer. The motion data
including the position and posture of end-effectors are sent to the host computer
mounted on the robot, and reference joint angles are calculated by inverse kinematics
inside the host computer.
We use 7-DoF inverse kinematics combined with a pseudo-inverse Jacobian to
calculate the joint angles of WAREC-1. Inverse kinematics of this form enables
multiple prioritized tasks [65]. Here, the first task with higher priority is tracking
of the desired end-effector trajectory, and the second task with lower priority is
reaching the target joint angles of the robot. In another word, the robot tries to get as
close to the target joint angles as possible while guaranteeing that the end-effector is
tracking the desired trajectory. The specific equation with pseudo-inverse Jacobian
and expression of subtask is shown as follows:
q̇ = J†ṙ₁ + (I − J†J)H(r₂d − r₂)   (7.1)

where q̇ ∈ R^{7×1} is the angular velocity of the joints; ṙ₁ ∈ R^{6×1} is the vector of the first task, which is the velocity of an end-effector here; H ∈ R^{7×7} is a weight matrix; r₂d ∈ R^{7×1} is the desired vector of the second task, which is the target joint angles here; r₂ ∈ R^{7×1} is the actual vector of the second task; I is an identity matrix; J ∈ R^{6×7} is the Jacobian; and J† ∈ R^{7×6} is the pseudo-inverse of J.
With the inverse kinematics above and an appropriate r₂d given, self-collisions
caused by joint-angle limit violations can be avoided in advance. Here, r₂d is obtained
empirically at present.
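As an illustration of the prioritized inverse kinematics described above, the following sketch resolves a 6-dimensional end-effector velocity task on a 7-DoF limb while pushing the joints toward target angles in the null space. This is a minimal numerical example, not the WAREC-1 implementation; the Jacobian, gain, and target angles are placeholders, and the weight matrix is reduced to a scalar gain for brevity.

```python
import numpy as np

def prioritized_ik_step(J, r1_dot, q, q_target, k=1.0):
    """One velocity-IK step: track r1_dot (first task) exactly and
    drive q toward q_target (second task) in the Jacobian null space."""
    J_pinv = np.linalg.pinv(J)                  # J† (7x6)
    null_proj = np.eye(q.size) - J_pinv @ J     # I - J†J projects into the null space
    return J_pinv @ r1_dot + null_proj @ (k * (q_target - q))

# toy example: random 6x7 Jacobian, stationary end-effector
rng = np.random.default_rng(0)
J = rng.standard_normal((6, 7))
q = np.zeros(7)
q_target = np.full(7, 0.3)
q_dot = prioritized_ik_step(J, np.zeros(6), q, q_target)
# the null-space motion leaves the first task untouched:
print(np.allclose(J @ q_dot, np.zeros(6)))  # True (up to numerical precision)
```

Because J J†J = J, the null-space term contributes nothing to the end-effector velocity, which is exactly the property the prioritization relies on.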
Regarding the end-effector trajectory generation, we use the methods previously
reported [32, 57]. The following sections briefly explain how to generate motions of
vertical ladder climbing and crawling on rough terrain.
(a) Vertical Ladder Climbing [57]
Figure 7.16 presents a flowchart of motion generation of ladder climbing for WAREC-
1. Here, “Input limb(s) to move, target position and orientation of end-effector(s)”
is done manually by the operator, while the others are processed automatically. The
motion generation includes the following parts: (1) path-time independent trajec-
tory planning with path length minimization according to the given midpoints; (2)
stabilization of ladder climbing motion based on stability conditions on ladder. In
trajectory planning, arc-length parameterization is used to separate path and time
profile in trajectory planning. With path planned by cubic spline interpolation and
path length minimized, time planning along the planned path can be given freely to
meet our requirement, such as speed and acceleration adjustment for the protection
of motors and dynamic obstacle avoidance. Stability conditions are also considered
to guarantee that the robot will not fall or rotate, especially in the case of 2-point
contact ladder climbing (two limbs of the robot move simultaneously).
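The path-time decoupling described above can be sketched as follows: given a geometric path (e.g., one planned by cubic spline interpolation through the midpoints), a time profile is assigned independently along the arc length. This is an illustrative sketch, not the authors' planner; the path points, speed, and time step are hypothetical values, and a densified polyline stands in for the spline.

```python
import numpy as np

def arc_length_reparam(path_pts, speed, dt=0.01):
    """Decouple path and timing: walk along an already-planned geometric
    path at a prescribed speed profile, returning timestamped positions."""
    seg = np.diff(path_pts, axis=0)
    seg_len = np.linalg.norm(seg, axis=1)
    s_cum = np.concatenate([[0.0], np.cumsum(seg_len)])  # cumulative arc length
    total = s_cum[-1]
    traj, t, s = [], 0.0, 0.0
    while s <= total:
        # locate the segment containing arc length s and interpolate linearly
        i = np.searchsorted(s_cum, s, side="right") - 1
        i = min(i, len(seg_len) - 1)
        frac = (s - s_cum[i]) / seg_len[i] if seg_len[i] > 0 else 0.0
        traj.append((t, path_pts[i] + frac * seg[i]))
        s += speed * dt   # constant-speed time law; any profile can be substituted
        t += dt
    return traj

# hypothetical end-effector path through three midpoints (metres)
pts = np.array([[0.0, 0.0], [0.1, 0.2], [0.0, 0.4]])
traj = arc_length_reparam(pts, speed=0.05)
```

Substituting a different time law for the `s += speed * dt` line (e.g., trapezoidal velocity) changes the motor speed and acceleration profile without touching the geometric path, which is the point of the separation.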
(b) Crawling on Rough Terrain [32]
The key factor in determining the crawling motion gait of a four-limbed robot is the
manner that its torso moves in locomotion. There are two types of crawling motion
gait. For one type of gait, the torso of the robot keeps contact with the ground; this gait
was applied to robots such as Chariot [36] and the C/V robot [41]. For the other
type of gait, the torso of the robot contacts the ground intermittently. One advantage of
the former gait is that it keeps the center of gravity of the robot lower than
the latter gait does, thereby reducing the risk of damage from the impact force when the
robot falls down or turns over. However, when the former gait is applied to a robot
getting over rubble, it becomes more difficult for the robot to move because the torso
more easily gets stuck on rubble protruding from the ground. Taking this
disadvantage into consideration, we chose the latter gait for the crawling
motion described herein.
The proposed crawling motion is shown in Fig. 7.17. Considering the speed of
the robot, it is desirable to reduce the number of phases in the motion. Therefore, the
crawling motion consists of two phases: the feet stance phase and the torso stance
phase, which occur alternately as the robot moves forward.
It is preferable for the feet and torso of the robot to move as vertically and horizon-
tally as possible to reduce the risk of collision with rubble while they move forward.
Figure 7.18 illustrates the trajectories of the feet and torso. To generate the trajectory
of the feet, four points are set and connected. The foot trajectory can be described as
follows: first, all feet are lifted up; second, they go forward horizontally; and third,
they descend vertically. The trajectory of the torso follows the same order as that of
Fig. 7.18 Trajectories of the feet and torso during proposed crawling motion
the feet. The upward motion of the torso is equivalent to the downward motion of
the feet and vice versa.
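The two-phase gait above can be sketched as waypoint generation. The step length and lift height below are hypothetical values, not the WAREC-1 parameters:

```python
def crawl_trajectory(n_cycles, step_len=0.2, lift=0.1):
    """Waypoints for the two-phase crawl: the feet swing while the torso rests
    on the ground, then the torso swings while the feet support. Each swing
    lifts vertically, moves forward by step_len, and lowers vertically
    (cf. the rectangular trajectories of Fig. 7.18). Returns (body, x, z)."""
    traj, x_feet, x_torso = [], 0.0, 0.0
    for _ in range(n_cycles):
        for dx, z in [(0.0, lift), (step_len, lift), (step_len, 0.0)]:
            traj.append(("feet", x_feet + dx, z))     # torso stance phase
        x_feet += step_len
        for dx, z in [(0.0, lift), (step_len, lift), (step_len, 0.0)]:
            traj.append(("torso", x_torso + dx, z))   # feet stance phase
        x_torso += step_len
    return traj

traj = crawl_trajectory(2)  # two full cycles, 12 waypoints
```

The vertical-then-horizontal segments keep the moving body clear of protruding rubble, as argued in the text.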
7.1.3 Experiments
We designed and developed a test bed including a powder brake, a spur gear, and
a torque/rotational speed meter in order to evaluate an actuator unit developed for
WAREC-1. Figure 7.19 shows the overview of the test bed with an actuator unit (high
output).
We operated the actuator unit at a constant rotational speed, and a speed command
was given so that the motor rotational speed would be in five stages of approximately
10–70 deg/s. Evaluation experiments were conducted by increasing the supply current to the powder brake and increasing the output torque of the actuator unit, and
then we measured motor torque and rotational speed by the torque and rotational
speed meter.
Figure 7.20 shows experimental results. We can see that the rated torque of 317 Nm
was output at the rated speed of 67.2 deg/s (= 11.2 rpm). Therefore, it can be said that
the actuator unit (high output) meets the required specification in the continuous
operation range.
Fig. 7.20 Experimental result of performance evaluation of actuator unit (high output)
WAREC-1 is designed to be able to stand with two/four limbs and crawl on its
belly. Therefore, we conducted transition experiments so that the robot transforms
locomotion styles among a two-limb stance, a four-limb stance and crawling. The
Fig. 7.23 Transition among a two-limb stance, a four-limb stance, and crawling on its belly
transition motion was generated so that the ZMP existed inside the support area
formed by the support points between the end-effector and the ground [13].
As an experimental result, we confirmed that WAREC-1 could conduct the transi-
tion among all locomotion styles. First, the robot stood with two limbs and changed
to crawling motion through the four-limb stance. After moving forward by crawling
on its belly, WAREC-1 stood up on the opposite two limbs as depicted in Fig. 7.23.
Since WAREC-1 has no distinction between arms and legs, WAREC-1 can perform
a handstand, which is difficult for ordinary humanoid robots.
This section describes a novel four-limbed robot, WAREC-1, that has advanced
locomotion capability in hazardous environments. At disaster sites, there are var-
ious types of environments through which a robot must move, such as rough ter-
rain with rubble piles, narrow places, stairs, and vertical ladders. To move in such
environments, we proposed a four-limbed robot that has various locomotion styles,
such as bipedal/quadrupedal walking, crawling, and ladder climbing. WAREC-1 was
designed to have identically structured limbs. The number of DoFs for the whole
body is 28, with 7-DoFs in each limb. The robot weighs 155 kg, and has a height
of 1,690 mm when standing on two legs. We developed three types of actuator units
with hollow structure to pass the wiring inside the joints of WAREC-1. The body has
a concave shape in order to prevent slippage during crawling on rubble piles. The
end-effector has a hook-like shape. The grooves, which work as a foot when climbing
a vertical ladder, have a triangular shape to make them easier to hook on rungs. The
hook, which works as a hand, is on the back of the end-effector. Through fundamental
experiments using the test bed to evaluate actuator units with hollow structure, we
confirmed that the actuator units developed for WAREC-1 meet the required speci-
fications in the continuous operation range. Through experiments using WAREC-1,
we confirmed that WAREC-1 could climb up and down a vertical ladder with a
safety cage, move on rough terrain filled with rubble piles by creeping on its stomach, and transform locomotion styles among a two-limb stance, a four-limb stance,
and crawling.
Our next goal is to conduct further research on trajectory generation, dynamics,
and control of WAREC-1. We will also improve the end-effector so that it can serve as
a hand for manipulation tasks, and will research supervised autonomy using
perception sensors.
SLAM is categorized into three problems according to the nature of the target
problem: (1) position tracking (local localization), (2) global localization
(initial localization), and (3) the kidnapped robot problem. In the position tracking
problem, we assume the initial position and posture are given to the robot, and
the robot then conducts self-localization through map building. On the other hand, in
global localization an initial map is given to the robot. The robot estimates
its position from the initial scan of distance data, and afterward conducts
position tracking while updating the given map. The kidnapped robot
problem is a special case of global localization in which the map contains many
candidate positions and postures corresponding to the currently measured
distance data. This problem often occurs in a building composed of rooms of the
same size with identical doors and pillars at equal intervals. Since we focus
on disaster situations, we deal with local and global localization problems in this
paper.
In general, since a robot has to move on rough terrain in a disaster situation, we
should use a 3D environmental map rather than a 2D one. Furthermore,
we should use a redundant number of range sensors in a disaster situation,
because some range sensors may break down owing to unexpected troubles such
as building collapse or collision with debris. Various types of 3D range sensors
such as LiDAR (Laser Imaging Detection and Ranging) have been developed for
self-driving cars, but their vertical resolution is not sufficient for a 3D environmental
map. Therefore, in this study we develop a laser range sensor array (LRSA) composed of
multiple 2D laser range finders (LRFs) with a pan or tilt mechanism. We first
discuss the advantage of using two LRFs [23] and propose a feature extraction method.
We then propose a SLAM method using the LRSA.
This paper is organized as follows. Section 7.2 explains the measurement method
and feature extraction method of the LRSA. Section 7.3 explains several SLAM methods
based on ES (ES-SLAM). Finally, we show several experimental results of the
ES-SLAM method using SLAM benchmark datasets.
This section explains the hardware mechanism and software modules of a laser range
sensor array (LRSA). Figure 7.24 shows the first prototype of the LRSA (LRSA-1),
composed of two LRFs attached in parallel to two servo actuators. This
mechanism of LRSA-1 is similar to a stereo vision system, but LRSA-1 can
measure 3D distance directly. Tables 7.8 and 7.7 show the specifications of the LRF (UST-
20LX) and the actuator (FHA-8C). Figure 7.25 shows an example of measurement by
the two LRFs; the measurement result shows that one scan by LRSA-1 can cover the
surrounding area, including several people and walls. Since each LRF can
be panned individually, we can obtain two 3D distance images with
different views. Here, we can control each LRF with a different pan velocity and a
different measurement range. In this way, we can conduct intensive measurement,
like selective attention, or sparse measurement for global recognition of the
surrounding area.
The position and posture of the LRSA in the global coordinate system are represented by
(x, y, z, θ_roll, θ_pitch, θ_yaw), to be estimated through the movement of the mobile robot.
The local coordinates of the measurement data are calculated using the current posture
of each LRF, based on the center of the two LRFs shown in Fig. 7.24.
Figure 7.26 shows the second prototype of the LRSA (LRSA-2), composed of two
sets of two LRFs (LRSA-1) with a wing mechanism. If the wing angle is changed,
the corresponding sensing range changes; in this way, we can change the attention
range according to the position of a target. Figure 7.27 shows the third prototype
(LRSA-3), composed of two sets of two LRFs in which the mounting angles of
the individual LRFs differ; LRSA-3 is mounted on the body
of WAREC (Fig. 7.28). We can build a 3D environmental map, but we use 2D plain
Fig. 7.26 LRSA-2 composed of two sets of two LRF with a wing mechanism
Fig. 7.27 LRSA-3 composed of two sets of two LRF where the attached angle of individual LRF
is different in two LRF
Basically, we use a method of occupancy grid mapping. Figure 7.29 shows the concept
of the occupancy grid map. Here, the values of all cells are initialized to 0. In this research,
we use the following simple definition of the occupancy grid map:
map_t(x, y) = hit_t(x, y) / (hit_t(x, y) + err_t(x, y))   (7.2)
where hit_t(x, y) and err_t(x, y) are the numbers of measurement (hit) points and
pass-through points of the LRF up to the tth step, respectively. The measurement data
are represented by (d_i, θ_i), i = 1, 2, . . ., M, j = 1, 2, . . ., L_i, where d_i is the
measured distance from the LRF; θ_i is the angle of the measurement direction; M is the
total number of measurement directions; and L_i (= α_Res · d_i) is the number of resolution
steps used in map building with the occupancy grid model. The map is therefore updated
by the following procedure, where (x_p, y_p) is the position of the mobile robot; r_p is its
posture; d_i is the measured distance from the LRF in the ith direction; θ_i is the angle of
the measurement direction; and α_MAP is the scale factor mapping from the real world to
the grid map.
Algorithm 1: Map-update
1: for i = 1 to M do
2:  for j = 1 to L_i do
3:   u_{i,j} = (j/L_i)(d_i cos(θ_i + r_p)) + x_p
4:   v_{i,j} = (j/L_i)(d_i sin(θ_i + r_p)) + y_p
5:   x_{i,j} = [α_Map · u_{i,j}]
6:   y_{i,j} = [α_Map · v_{i,j}]
7:   if j = L_i then
8:    hit_t(x_{i,j}, y_{i,j}) = hit_{t−1}(x_{i,j}, y_{i,j}) + 1
9:   else
10:   err_t(x_{i,j}, y_{i,j}) = err_{t−1}(x_{i,j}, y_{i,j}) + 1
11:  end if
12:  end for
13: end for
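The map update of Algorithm 1 together with Eq. (7.2) can be sketched in code. This is an illustrative reimplementation, not the authors' software: it assumes grid indices stay inside the array bounds, and for simplicity a single scale factor serves both as the per-ray sampling resolution and as the world-to-grid scale.

```python
import math
import numpy as np

def update_map(hit, err, pose, scan, alpha_map=10.0):
    """Ray-trace one scan into hit/err count grids (cf. Algorithm 1).
    pose = (x_p, y_p, r_p); scan = [(d_i, theta_i), ...].
    Cells along a ray are counted as pass-through (err); the endpoint as hit."""
    x_p, y_p, r_p = pose
    for d_i, th_i in scan:
        L_i = max(1, int(alpha_map * d_i))   # sampling resolution along the ray
        for j in range(1, L_i + 1):
            u = (j / L_i) * d_i * math.cos(th_i + r_p) + x_p
            v = (j / L_i) * d_i * math.sin(th_i + r_p) + y_p
            xi, yi = int(round(alpha_map * u)), int(round(alpha_map * v))
            if j == L_i:
                hit[xi, yi] += 1    # measurement (hit) point
            else:
                err[xi, yi] += 1    # pass-through point

def occupancy(hit, err):
    """Eq. (7.2): map_t = hit / (hit + err), 0 where unobserved."""
    denom = hit + err
    return np.divide(hit, denom, out=np.zeros_like(hit, dtype=float),
                     where=denom > 0)

hit = np.zeros((50, 50), dtype=int)
err = np.zeros((50, 50), dtype=int)
# two toy beams from pose (2 m, 2 m, 0 rad): one ahead, one to the side
update_map(hit, err, pose=(2.0, 2.0, 0.0),
           scan=[(1.5, 0.0), (1.5, math.pi / 2)])
occ = occupancy(hit, err)
```

Cells crossed by a beam accumulate `err` counts and stay near 0 occupancy, while the beam endpoints accumulate `hit` counts and approach 1.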
First, we define the feature points extracted from measurement data by LRSA-1. The
measurement range can be divided into four regions: (1) the range measured by LRF-1,
(2) the range measured by LRF-2, (3) the overlapping range measured by both LRF-1 and
LRF-2, and (4) the unmeasurable range. We can use the overlapping range to extract
features based on the disparity between LRF-1 and LRF-2 (Fig. 7.30).
Figure 7.31 shows two distance images measured by LRF-1 and LRF-2 where k
is the measurement ID of one scan, j1 and j2 are the discrete joint angle ID of LRF-1
and LRF-2, respectively. We use dynamic time warping (DTW), a dynamic-programming
technique for temporal template matching, to extract feature points from the
measurement data. The cost function is defined as

c_k(j₁, j₂) = |d^1_{k,j₁} − d^2_{k,j₂}|   (7.3)
V_k(1, 1) = c_k(1, 1)   (7.7)

The answer is given by V_k(J₁, J₂). With this method, we can evaluate the matching
degree between two templates corresponding to the measurement data, but here we
use the degree of difference as a feature. If expansion or contraction occurs in the
matching, it indicates that measurement from one side is possible but measurement
from the other side is impossible, as shown in Figs. 7.32 and 7.33, i.e., a local occlusion
may occur owing to the disparity. Next, we show an example of 3D map building
(Fig. 7.34), where the number of measurement points is 180 (J₁ = J₂ = 180).
Figure 7.35 shows an experimental result of feature extraction: in (a), green and yellow
dots indicate measurement data from LRF-1 and LRF-2, respectively; in (b), red and blue
dots indicate expansion and contraction actions, respectively.
Fig. 7.35 Result of feature extraction. In a, green and yellow dots indicate measurement data
from LRF1 and 2, respectively. In b, red and blue dots indicate expansion and contraction actions,
respectively
(b) Feature extraction. Red and blue dots indicate feature points. Green dots indicate other points.
Orange circles indicate examples of false detection.
The number of measurement data points is 384,748, but the number of feature points is
only 5,315. Figure 7.36
shows a result of feature extraction (k = 170): (a) full data; (b) feature extraction,
where red and blue dots indicate individual feature points. The extracted features
include useful and important information such as edges and corners.
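The DTW matching between the two LRF distance profiles can be sketched as follows. Since the chapter's recursion (Eqs. (7.4)–(7.6)) is not reproduced above, the textbook DTW recursion is assumed here; non-diagonal moves in the warping path correspond to the expansion/contraction steps used as feature candidates.

```python
import numpy as np

def dtw_path(d1, d2):
    """DTW between two distance profiles with cost c(j1, j2) = |d1[j1] - d2[j2]|
    (Eq. (7.3)). Returns the warping path; the recursion is the standard
    textbook one, assumed in place of the chapter's elided Eqs. (7.4)-(7.6)."""
    J1, J2 = len(d1), len(d2)
    V = np.full((J1, J2), np.inf)
    V[0, 0] = abs(d1[0] - d2[0])
    for a in range(J1):
        for b in range(J2):
            if a == b == 0:
                continue
            prev = min(V[a - 1, b] if a else np.inf,
                       V[a, b - 1] if b else np.inf,
                       V[a - 1, b - 1] if a and b else np.inf)
            V[a, b] = abs(d1[a] - d2[b]) + prev
    # backtrack from V(J1, J2); non-diagonal moves mark expansion/contraction
    path, a, b = [], J1 - 1, J2 - 1
    while (a, b) != (0, 0):
        path.append((a, b))
        moves = [(a - 1, b - 1), (a - 1, b), (a, b - 1)]
        moves = [(x, y) for x, y in moves if x >= 0 and y >= 0]
        a, b = min(moves, key=lambda m: V[m])
    path.append((0, 0))
    return path[::-1]

# identical profiles warp purely diagonally, yielding no feature candidates
p = dtw_path([1.0, 1.2, 1.4], [1.0, 1.2, 1.4])
```

When the two profiles differ because of a local occlusion, the cheapest path takes horizontal or vertical steps there, and those indices are flagged as feature points.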
where f_k is the fitness value of the kth individual; f_max and f_min are the maximum
and minimum fitness values in the population; N(0, 1) is a normal random
value; and α_SSGA and β_SSGA are the coefficient and offset, respectively. While self-
adaptive mutation refers to an individual's own fitness record, adaptive mutation refers to
the average, maximum, and minimum fitness values of the candidate solutions in
the population, i.e., adaptive mutation changes the variance relative to the fitness
values of the candidate solutions. The fitness value of the kth candidate
solution is calculated by the following equations,
fit_k = Σ_{i=1}^{M} p_t^occ(x_{i,L}, y_{i,L}) · map_t(x_{i,L}, y_{i,L})   (7.9)

p_t^occ(x_{i,L}, y_{i,L}) = Σ_{i=1}^{M} hit_t(x_{i,L}, y_{i,L}) / [Σ_{i=1}^{M} hit_t(x_{i,L}, y_{i,L}) + Σ_{i=1}^{M} err_t(x_{i,L}, y_{i,L})]   (7.10)
where the summation of the map values is the basic fitness value in (μ + 1)-ES and
p_t^occ is a penalty function. The summation of the map values is large when the pose
estimate is accurate. Furthermore, the penalty function takes a low value if many
measurement points fall on empty cells. Therefore, this problem is defined as a
maximization problem. In principle, we could estimate the robot pose using only the
summation of the map values; however, that estimate sometimes gets stuck
in local optima, depending on the environment. Therefore, we use the penalty function
p_t^occ to avoid this situation. The localization based on (μ + 1)-ES finishes when the
number of iterations reaches the maximum number T. Algorithm 2 shows the overall
procedure of ES-SLAM. ES-SLAM is a very simple algorithm and is easy to implement.
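The (μ + 1)-ES localization loop can be sketched as below. For brevity, this illustration uses the summation of map values alone as the fitness (Eq. (7.9) without the penalty term p_t^occ) and a fixed mutation variance; the grid, scan, and parameters are toy values, not the experimental setup.

```python
import math
import random

def fitness(grid, pose, scan):
    """Sum of occupancy map values at the projected scan endpoints
    (Eq. (7.9) without the penalty term, for brevity)."""
    x, y, th = pose
    s = 0.0
    for d, a in scan:
        xi = int(round(x + d * math.cos(a + th)))
        yi = int(round(y + d * math.sin(a + th)))
        if 0 <= xi < len(grid) and 0 <= yi < len(grid[0]):
            s += grid[xi][yi]
    return s

def es_localize(grid, scan, init, mu=10, iters=200, sigma=(1.0, 1.0, 0.1)):
    """(mu + 1)-ES: each iteration mutates one parent and keeps the top mu."""
    rng = random.Random(0)
    pop = [tuple(p + rng.gauss(0, s) for p, s in zip(init, sigma))
           for _ in range(mu)]
    for _ in range(iters):
        parent = rng.choice(pop)
        child = tuple(p + rng.gauss(0, s) for p, s in zip(parent, sigma))
        pop.append(child)
        pop.sort(key=lambda q: fitness(grid, q, scan), reverse=True)
        pop = pop[:mu]   # (mu + 1) selection
    return pop[0]

# toy map: everything at x >= 10 is an occupied wall
grid = [[1.0 if x >= 10 else 0.0 for _ in range(20)] for x in range(20)]
scan = [(5.0, 0.0)]                 # one beam, 5 cells ahead, hits the wall
best = es_localize(grid, scan, init=(8.0, 10.0, 0.0))
```

Poses whose projected beam endpoint lands on the wall score 1.0, so selection concentrates the population on the set of poses consistent with the measurement.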
Fig. 7.37 Correct map building result using the benchmark datasets
Fig. 7.38 Experimental results of map building and localization in Freiburg Indoor Building 079.
In b, red line indicates localization result
function does not include the penalty function p_t^occ. In addition, the average
computational time for both results is about 18 ms, which we consider sufficient
for online SLAM. In this way, SLAM based on (μ + 1)-ES
can build the map and localize the robot position by designing a suitable fitness
function according to the map building method.
We sometimes have to deal with a global localization problem, but solving it takes
much time because the search space is quite large.
Fig. 7.39 Experimental results of map building and localization in MIT CSAIL Building. In b, red
line indicates localization result
where (x_k, y_k) is the position of the mobile robot in the kth layer. Figure 7.42 shows
the basic concept of the multi-resolution map. A value of the lower-resolution map is
map_{k+1}(x_{k+1}, y_{k+1}) = Σ_{i=0}^{1} Σ_{j=0}^{1} map_k(x_k + i, y_k + j)   (7.14)

x_k = 0, k², 2k², . . ., X;  y_k = 0, k², 2k², . . ., Y;  k = 1, 2, . . ., K   (7.15)
As k increases, the map information becomes sparser. In this paper, we obtain
the low-resolution map by substituting k into Eq. (7.15). The information
of map_k(x_k, y_k) is normalized by the following equation,
n_k(x_k, y_k) = map_k(x_k, y_k) / 2^{2(k−1)}   (7.17)
Basically, if the state of a cell is uncertain, the value of map_k(x_k, y_k) approaches 0.
However, map_k(x_k, y_k) can also be 0 in cases other than the unknown
state, since it is determined by simple summation. Therefore, in
order to distinguish the unknown state from uncertain states, we define map_k^Unk(x_k, y_k)
by the following equations.
map_{k+1}^{Unk}(x_{k+1}, y_{k+1}) = Σ_{i=0}^{1} Σ_{j=0}^{1} |map_k(x_k + i, y_k + j)|   (7.18)

n_k^{Unk}(x_k, y_k) = map_k^{Unk}(x_k, y_k) / 2^{2(k−1)}   (7.19)
We can thus obtain a small abstract map for self-localization, reducing the
search space.
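Equations (7.14) and (7.17) can be sketched directly: each coarser layer sums 2 × 2 blocks of the previous layer, and dividing layer k by 2^{2(k−1)} recovers the normalized value n_k. A minimal illustration (indexing details such as odd-sized edges are handled by simple trimming, an assumption of this sketch):

```python
import numpy as np

def build_pyramid(map1, levels):
    """Each coarser layer sums 2x2 blocks of the previous one (Eq. (7.14))."""
    maps = [map1.astype(float)]
    for _ in range(levels - 1):
        m = maps[-1]
        h, w = (m.shape[0] // 2) * 2, (m.shape[1] // 2) * 2  # drop odd edge cells
        maps.append(m[:h, :w].reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3)))
    return maps

def normalized(maps, k):
    """n_k(x_k, y_k) = map_k / 2**(2*(k-1)), with k starting at 1 (Eq. (7.17))."""
    return maps[k - 1] / (2 ** (2 * (k - 1)))

# a fully occupied 8x8 map stays 1.0 at every resolution after normalization
maps = build_pyramid(np.ones((8, 8)), levels=3)
print(normalized(maps, 3))  # 2x2 array, all entries 1.0
```

The normalization makes fitness values comparable across resolution levels, which is what allows the coarse-to-fine switching of the global localization procedure.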
We apply a multi-resolution map for global localization. We use two different
types of multi-resolution maps based on the uncertainty and unknownness. By using
these values, we can obtain the state of the cell easily as the following equation,
where α_State indicates the threshold value. If s_k(x_k, y_k) is 1, then the cell
is occupied. Figure 7.43 shows an extraction result of occupied
cells, drawn in blue, at each resolution level. Next, we explain an intelligent self-
localization method. The initial self-localization is done by (μ + λ)-ES because we
mainly need a global search in a large search space. Algorithm 2 shows the procedure
of global localization. The index k indicates the level of the multi-resolution map;
n indicates the number of steps; N indicates the maximal number of generations
(search iterations). In steps 5 and 12, the weight w_i^l is calculated by the
following equation,
w_i^l = fit_i^l / Σ_{j=1}^{μ} fit_j^l   (7.22)
Fig. 7.43 An example of multi-resolution maps (each occupied cell is depicted in blue)
After the initial localization, the self-localization of the robots is done by (μ + 1)-
ES in order to perform the local search. In this way, the robots can estimate their current
positions as precisely as possible.
Iteration Process:
Step 6: Produce offspring depending on w_i^l.
Step 7: Measure the LRF data.
Step 8: Estimate the other robots' positions according to the pose of each individual.
Step 9: If other robots appear in the sensing range, the corresponding LRF data are not used in step 10.
Step 10: Calculate the fitness value fit_i^l.
Step 11: Select the top μ candidates as the next parents.
Step 12: Calculate the weight w_i^l.
Step 13: If the best fitness value is higher than α_h, then k ← k − 1 and n ← 0; otherwise, go to step 10.
Step 14: If n > N and the best fitness value is lower than α_s, then k ← k + 2 and n ← 0.
Step 15: If k = 1, finish the initial self-localization; otherwise, go to step 6.
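The global localization procedure above can be sketched as follows. The fitness function, parameter defaults, and pose representation are illustrative stand-ins for the LRF matching used in the real system:

```python
import random

def es_global_localization(fitness, bounds, mu=10, lam=500, h=0.75, s=0.5,
                           k_init=2, n_max=50, sigma=0.5):
    """Toy sketch of the (mu+lambda)-ES global localization loop (steps 6-15).
    `fitness(pose, k)` is a user-supplied stand-in for matching LRF data
    against the resolution-k map; the real sensor model is not reproduced."""
    k, n = k_init, 0
    parents = [tuple(random.uniform(lo, hi) for lo, hi in bounds)
               for _ in range(mu)]
    while True:
        fits = [max(fitness(p, k), 1e-9) for p in parents]
        total = sum(fits)
        weights = [f / total for f in fits]                   # step 12 / Eq. 7.22
        # Step 6: offspring are produced from parents sampled by weight
        offspring = [tuple(x + random.gauss(0.0, sigma) for x in
                           random.choices(parents, weights=weights)[0])
                     for _ in range(lam)]
        pool = parents + offspring
        pool.sort(key=lambda p: fitness(p, k), reverse=True)  # steps 10-11
        parents, n = pool[:mu], n + 1
        best = fitness(parents[0], k)
        if best > h:                                          # step 13: refine
            k, n = k - 1, 0
            if k <= 1:                                        # step 15: finish
                return parents[0]
        elif n > n_max and best < s:                          # step 14: coarsen
            k, n = k + 2, 0
```

With a smooth toy fitness such as 1/(1 + distance-to-true-pose), the loop refines k until the finest level is reached and returns the best pose found.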
Fig. 7.44 An experimental result of initial self-localization by (μ+λ)-ES (occupied cells and candidates are depicted in orange and pink, respectively)
Fig. 7.45 History of the resolution level and the best fitness value of the (μ+λ)-ES
candidates (λ) are 500; α_State = 0.01; α_h = 0.75; α_s = 0.5; the initial resolution level k of the multi-resolution map is 2. Figures 7.44 and 7.45 show experimental results of initial self-localization. In Fig. 7.44a, the candidates spread all over the map in order to estimate the current robot position (the candidates are drawn in purple). However, the best fitness value stays low for 20 generations, because the change of the fitness value is very sensitive to the change of the estimated position in the high-resolution map (k = 2). Therefore, the robot updates the resolution level to k = 4 in Fig. 7.44c. When the resolution level is low (k = 4), the best fitness value is higher than 0.9 in Fig. 7.45. In the low-resolution map, it is easy to roughly estimate the robot position because the low-resolution map has a wide acceptable error range. After first estimating the robot position in the low-resolution map, the best fitness value remains high when the resolution level is refined again. In this way, the robot can estimate the current position by using the multi-resolution map; the best candidate is drawn as the red triangle in Fig. 7.44d.
(a) Initial pose of LRSA; (b) without adjustment; (c) with adjustment
In order to reduce computational cost and time, we use three 2D plane maps (x−y, y−z, x−z) cut from the 3D environmental map for self-localization. As a result of integrating ES-SLAM on the three 2D plane maps, we can obtain the position (x, y, z) of a robot. Furthermore, we can calculate the posture angles θ_yaw, θ_roll, and θ_pitch from the x−y plane map, the y−z plane map, and the x−z plane map, respectively. If we solve these three self-localization problems (SLP) using the three 2D plane maps separately, we obtain redundant solutions. Since the robot moves on the x−y plane, we first estimate the current position (x∗, y∗, θ_yaw) of the robot by solving the SLP on the x−y plane map. Next, we obtain (z∗, θ_roll) by solving the SLP on the y−z plane map, using the obtained y∗ as a fixed value. Finally, we estimate the posture θ_pitch using x∗ and z∗ as fixed values. In this way, we can reduce the overall computational cost and time.
Step 1: (x∗, y∗, θ_yaw) by solving SLP(x−y, −)
Step 2: (z∗, θ_roll) by solving SLP(y−z, y∗)
Step 3: (θ_pitch) by solving SLP(x−z, x∗, z∗)
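The three-step decomposition above can be expressed compactly; `solve_slp` is a hypothetical stand-in for one 2D ES-SLAM localization on the named plane map:

```python
def localize_3d(solve_slp):
    """Sketch of the sequential three-step decomposition.  `solve_slp(plane,
    fixed)` performs one 2D self-localization on the given plane map with the
    given fixed values, and returns only the free variables it estimates."""
    x, y, yaw = solve_slp("x-y", ())        # Step 1: full 2D search
    z, roll   = solve_slp("y-z", (y,))      # Step 2: y* is fixed
    (pitch,)  = solve_slp("x-z", (x, z))    # Step 3: x* and z* are fixed
    return (x, y, z), (yaw, roll, pitch)
```

Each later step searches a strictly smaller space because earlier estimates are frozen, which is where the reduction in computational cost comes from.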
7.2.3 Conclusion
In this section, we explained two types of laser range sensor array (LRSA). The first one (LRSA-1) is a typical prototype composed of two laser range finders in parallel. The experimental results show that LRSA-1 can reduce the computational cost and time by using the feature points calculated by dynamic time warping. The second one (LRSA-2) has two wings, to which two LRSA-1 units are attached. Next, we explained how to use an evolution strategy (ES) for self-localization in SLAM (ES-SLAM). We used (μ + 1)-ES and (μ + λ)-ES for local and global localization in SLAM, respectively. The experimental results show that the localization performance is sufficient to control a robot in real time. Finally, we explained the 3D self-localization method using (μ + 1)-ES. The proposed method can reduce the computational cost and realize real-time tracking. The advantage of the LRSA lies in its capability of posture control to measure the distance data required for the different sets of 2D plane SLAM. As future work, we will discuss active control methods of the LRSA according to the situation faced. Furthermore, we will develop a smaller LRSA for general-purpose 3D measurement.
After a disaster such as an earthquake or flood, rescue robots are required to move into a destroyed environment to collect information, rescue victims, and complete other tasks. In comparison with crawler- and wheel-type robots, legged robots are more suitable for moving in destroyed terrain, where other robots find it difficult to maintain balance. There have been several studies on controlling different types of legged robots walking across destroyed environments such as rocky terrain [48], steps, ladders [62], and so on. Apart from locomotion, the legs can be used to conduct subtle tasks such as object removal, slotting, and opening doors [6].
However, teleoperation is still the most feasible method of controlling legged robots. Normally, legged robots are operated semi-autonomously: the operator sends the desired velocity or position command to the legged robot, and the controller inside the robot executes the command and maintains balance.
It is difficult for operators to understand the surrounding conditions and remotely
control robots by only collecting information from the on-robot cameras in a
destroyed environment. We believe that operating a robot from a third-person view-
point is an effective solution. Shiroma et al. [53] set a pole at the rear of a robot and
mounted a camera on the top of the pole to realize a third-person viewpoint. The subjects reported that the robot reached the target faster with fewer crashes. Generating
a third-person viewpoint typically consists of attaching a camera on the top of a pole
such that it looks down upon the robot [52], or using 3D light detection and ranging
(LiDAR) to generate real-time 3D environmental maps in a virtual environment and
then putting the robot Computer Graphics (CG) model into the virtual environment
to show the virtual environment from a third person viewpoint [30].
Limited communication conditions should also be considered when developing a rescue robot, because public communications are partly disrupted after disasters such as the Great East Japan Earthquake in 2011. The abovementioned teleoperation systems are difficult to apply because they require high communication traffic to stream real-time camera images or point clouds. The teleoperation interface that uses past images [56] is one of the few suitable solutions, because it provides a third-person viewpoint while considering limited communication conditions. It records images captured in the past by the on-robot camera, together with the position/orientation of each recorded image. Then, using the relationship of position/orientation between the recorded past images and the current state of the robot, it generates a synthetic image by overlaying the current CG robot onto a selected past image. This third-person-viewpoint image can display both the current robot state and the surrounding environment. Moreover, because the recorded images are not transmitted continuously, they occupy fewer communication resources. The effectiveness of the past-image system has been demonstrated for wheel robots, crawler robots, and mobile manipulators.
In this study, we improved the past image system and applied it to a legged robot.
Additionally, we removed tilted images by improving the evaluation function. The
tilted camera images are captured because the legged robot moves in 3D and the
370 K. Hashimoto et al.
ground on which the robot stands is not planar. Therefore, we considered gravity
in the evaluation function to avoid selecting tilted past images because such images
would result in the operators misunderstanding the direction of gravity.
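A minimal sketch of a gravity-aware selection rule in this spirit is given below. The scoring weights, the pose format, and the function names are assumptions for illustration, not the evaluation function actually used in this work:

```python
import math

def score_past_image(img_pose, robot_pose, w_tilt=2.0):
    """Hypothetical score for one recorded past image (lower is better).
    img_pose / robot_pose: ((x, y, z), (roll, pitch, yaw)) -- an assumed format."""
    (p, (roll, pitch, _yaw)) = img_pose
    q, _ = robot_pose
    dist = math.dist(p, q)            # prefer viewpoints recorded near the robot
    tilt = abs(roll) + abs(pitch)     # penalize gravity-misaligned (tilted) views
    return dist + w_tilt * tilt

def select_past_image(records, robot_pose):
    """Pick the best past image for the third-person view."""
    return min(records, key=lambda r: score_past_image(r, robot_pose))
```

The tilt penalty is the point of the improvement described above: of two otherwise similar viewpoints, the one whose camera axis is better aligned with gravity wins, so the operator is not misled about the direction of gravity.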
Our proposed system is applied to the WAREC-1 legged robot, as shown in Fig. 7.49.
Each leg has seven degrees of freedom (DOF) and six-DOF torque sensors. A variable
ranging sensor [22], an IMU sensor, and a microphone array [49] are mounted onto the
robot for localization. The operator uses a virtual marionette system [24] to operate
all four legs to an assigned position on the ground during the crawling motion [14],
or operates a single leg to an assigned posture. There are two cameras on the robot:
one is mounted onto the front of the robot, and the other is mounted onto the left
front leg. The image size of the two cameras is 640 × 480.
The system structure is shown in Fig. 7.50. The computers are connected to a
local area network (LAN), and Robot Operating System (ROS) [44] architecture is
used to organize the communication. In our proposed system, three computers are
involved. The sensor computer and control computer are mounted onto WAREC-1,
and an operator computer is placed at a remote location. The sensor computer sends
back the captured images from the mounted cameras, executes the localization, and
sends back the robot poses. The control computer sends back the joint angles and
torque data from the torque sensors. The proposed system launches on the operator
computer and displays the result on it. The operator monitors the third-person view
by using the past image records, operates the robot by using the operator computer,
and sends the control command to the control computer to activate WAREC-1.
During the communication through the LAN, the two cameras are set to capture images at a frequency of one frame per 2.0 s. A delay time of 2.0 s is set to simulate a 200 kbps network, which is considered a narrow-bandwidth condition, assuming a 3G network. Under this condition, a 640 × 480 image (approximately 25 kB) can be transferred in 1.0 s. Therefore, we set 2.0 s as the capture period for the two cameras.
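The bandwidth arithmetic behind these settings can be written out as a small check; the numbers come from the text, while the helper function itself is illustrative:

```python
def frame_schedule(image_bytes, link_bps, margin=2.0):
    """Transfer time of one image over the assumed link, and the capture
    period chosen with a safety margin (both in seconds)."""
    transfer_s = image_bytes * 8 / link_bps
    return transfer_s, transfer_s * margin

# numbers from the text: ~25 kB per 640x480 image, 200 kbps link
t, period = frame_schedule(25_000, 200_000)
# one image takes 1.0 s to transfer, so a 2.0 s capture period keeps the link free
```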
The field test was carried out at the robot test field in Fukushima. The field environ-
ment is shown in Fig. 7.51. The robot was assigned to climb up an 86 cm high stage,
rotate a valve using the right front leg, and then climb down the stage. The actual
robot climbing the stage is shown in Fig. 7.52. The proposed system is required to
help the operator operate each leg to climb up the stage.
In the beginning, the robot raises and waves its left front leg with a camera to cap-
ture the surrounding environment. The leg camera images showing the surrounding
environment are pushed into the past image memory storage. Then the robot returns
to its standard crawling motion. The operator uses the virtual marionette system to
operate the robot such that it moves forward to the front of the stage by performing
the crawling motion, while the legs are operated such that the robot climbs up the
Fig. 7.53 (a) Front camera image; (b) leg camera image; (c) proposed method
Fig. 7.54 Viewpoint created by our proposed method for the entire stage-climbing sequence
stage. Here, the third-person view obtained by using the past image records is used
to show the surrounding environment.
Figure 7.53 shows a partial scene of the field test. During the climb, the operator
cannot identify the surrounding environment easily by using a real-time camera image
only. As shown in Fig. 7.53a, the front camera cannot see the stage. Additionally,
as shown in Fig. 7.53b, the operator cannot obtain any useful information from the
leg camera. However, our proposed system can generate an image to show the state
of both legs and the state of the surrounding environment, as shown in Fig. 7.53c.
This demonstrates that our proposed system is useful in this scene. The entire stage-climbing sequence is shown in Fig. 7.54. We confirmed that our proposed system was able to adapt to the robot's motion sequence when climbing up the stage.
Legged robots such as WAREC-1 change their shape and attitude significantly according to their locomotion and leg movements. Therefore, to display the robot states appropriately, the system has to adapt to such changes occurring in the robot. To verify this, we conducted tests by using various
locomotion styles including bipedal, tripedal, quadrupedal, and quadrupedal crawl-
ing styles, as shown in Fig. 7.55.
7.3.4 Integrations
The result of the sound source localization technique [49] for determining the direc-
tion of human voice or other sound source is displayed as arrows on our proposed
system, as shown in Fig. 7.56a. This enables the operator to identify the direction from
where the sound sources originate and from which way to approach these sources.
This can be useful when searching for survivors in disaster sites.
The proposed system was also integrated with the virtual marionette system [24],
which is capable of operating the robot intuitively. The 2D display is placed next to
the robot CG model as shown in Fig. 7.56b.
Fig. 7.56 (a) Overlaying sound source localization result shown as arrows; (b) implemented on the virtual marionette system
7.3.5 Conclusion
In recent years, robots have increasingly been expected to carry out rescue operations in place of people. The working environment is variable and requires adaptive working solutions. A disaster response robot can move through various areas with irregular ground, such as vertical gaps and ladders, which a crawler robot cannot overcome. Moreover, removing obstacles, such as crushed rubble, is necessary. The robot should be able to handle obstacles of different sizes and shapes.
Many multi-fingered robot hands have been developed globally [45]. For example, the Shadow Dexterous Hand [7], which is equipped with absolute position and force sensors, has 20 degrees of freedom. It also has four under-actuated joints, which can be actuated by tendons through motors or a pneumatic air muscle, for 24 joints in total. The DLR hand [3] has three fingers and a thumb. Each finger has four joints and three degrees of freedom. The thumb has a special mechanism providing an extra degree of freedom for dexterous manipulation and power grasping. The robot hand can use various tools. The Robotiq 3-Finger Gripper [54] has 10 degrees of freedom driven by two actuators. It can grasp many objects and is marketed for industrial use. The authors have developed anthropomorphic robot hands, which look like human hands [21, 34, 35]. These robot hands can grasp and manipulate various objects. However, most humanoid robot hands have a lower fingertip force than human hands. A robot hand faces a trade-off between force and size. It is difficult for traditional robot hands to accomplish heavy tasks such as lifting obstacles.
We developed a novel robot hand having a small size and a large fingertip force. It can keep the fingertip force high by means of retention mechanisms, without an electric power supply. Experimental results show the effectiveness of the robot hand.
A disaster response robot should perform various tasks such as removing obstacles, using human tools, and opening and closing valves. In the first two years, three robot hands, which can provide a high fingertip force and grasp various objects dexterously, were developed as shown in Table 7.10. The robot hands were developed in cooperation with Adamant Namiki Precision Jewel Co., Ltd.
7.4.1.1 Concepts
This research aims at developing a robot hand that can be attached to a disaster response robot as its end effector.
1. The number of fingers is four, so that the robot hand can both grasp and manipulate an object.
2. Each finger has three degrees of freedom, so that the fingertip position can be controlled in three dimensions.
3. The robot hand has thumb opposability between all fingers, because it should be able to deal with objects of variable size and shape.
4. The robot hand should support the mass of the disaster response robot.
5. The robot hand should save energy, because the power source of the disaster response robot is a battery.
6. The size of the robot hand should be similar to that of the human hand, because it should be able to use human hand tools.
7. The motor driver circuits should be built into the robot hand, because the disaster response robot moves over various areas.
It is assumed that the robot is battery-operated at a disaster site. The robot hand has been developed to provide a high fingertip output force by using a ball screw mechanism. Although such a hand can continue grasping an object, doing so requires a significant amount of electric power. The weight of the disaster response robot is assumed to be larger than 100 kg. Because the robot hand should support this weight, one finger of the robot hand should have an output fingertip force larger than 150 N. The high output force is generated by the ball screw mechanism, which offers high-accuracy positioning and a high reduction ratio. The robot hand uses a retention mechanism without back drivability. The wedge effect of the retention mechanism dyNALOX [40] transfers torque from the input to the output in only one direction. The function of the retention mechanism is the same as that of a worm gear or a one-way clutch. However, the retention mechanism has the advantages of small size and light weight. In order to hold a high output force, the surfaces of the fingers and palm are covered with an elastic pad. The finger mechanism design of the second robot hand is shown in Fig. 7.57. The retention mechanism and a ball screw are built into the finger mechanism. The finger, which has four joints and three degrees of freedom, can maintain a high output force without an electric power supply. The first joint permits adduction/abduction. The second and third joints permit flexion/extension. The third joint actuates the fourth joint of the finger through a planar four-bar linkage mechanism. In the finger mechanism, all joints are modularized and all fingers are unitized.
An overview of the developed robot hand is shown in Fig. 7.58. The hand has four identical fingers. Each finger has four joints with three degrees of freedom; as a result, the robot hand has 16 joints with 12 degrees of freedom. In order to grasp objects of variable size and shape, the fingers are allocated as shown in Fig. 7.59. The figure shows the range of movement of each fingertip. One finger can come into contact with the other fingers. Three fingers are aligned for the enveloping grasp of a stick-shaped object, and one finger opposes the three for grasping a small object.
Twelve DC brushless motors with encoders are installed in the robot hand. Each motor has 10 wires, so a direct communication cable between the controller and the robot hand would consist of 120 wires. These wires would interfere with the smooth motion of the robot. To solve this problem, a compact wire-saving control system [8] is used. The wire-saving control system for the robot hand consists of an interface FPGA, motor drivers, and Ethernet. The FPGA circuit shown in Fig. 7.60 has the following functions:
1. To provide the PWM output signals of the 12 drivers for the DC brushless motors.
2. To provide up/down counters for the 12 encoders.
3. To communicate between the FPGA circuit and the control PC by UDP/IP.
The control system is installed on the back of the robot hand. The signals of the motors and encoders are handled by the control system and communicated to the control PC over the LAN. The control PC issues commands to the FPGA circuit for position and fingertip force control.
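A hypothetical sketch of such a UDP command path is shown below. The packet layout, port, and address are invented for illustration, since the actual FPGA protocol is not specified here:

```python
import socket
import struct

# Assumed values for illustration only -- not the real FPGA's address or layout.
FPGA_ADDR = ("192.168.0.10", 5005)

def make_pwm_command(duties):
    """Pack 12 PWM duty ratios (0.0-1.0) as little-endian 16-bit values,
    one per motor (a hypothetical packet format)."""
    assert len(duties) == 12
    return struct.pack("<12H", *(int(d * 65535) for d in duties))

def send_pwm_command(duties, sock=None):
    """Send one command packet to the FPGA over UDP/IP."""
    payload = make_pwm_command(duties)
    sock = sock or socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(payload, FPGA_ADDR)
```

UDP fits this use well: commands are small, frequent, and stale values are useless, so the retransmission machinery of TCP would only add latency.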
7.4.2 Experiments
The trajectory response of a finger joint of the second robot hand under PD control is shown in Fig. 7.61. The robot hand can move from fully open to fully closed in 2 s.
Figure 7.62 shows the fingertip force of the finger driven by the second motor. The robot hand can generate a force larger than 150 N. Figure 7.63 shows the fingertip force of the finger over 60 s. Only during the first 10 s was the second motor supplied with electric power; even at 60 s, the fingertip force value had not changed. Figure 7.64 shows the first robot hand lifting a sandbag with a mass greater than 50 kg. The robot hand holds the mass without an electric power supply. To our knowledge, this is the first demonstration of a robot hand with a size of 300 mm keeping a 50 kg object grasped with such low energy consumption.
Figure 7.65 shows the third robot hand making a hole in a concrete plate with a thickness of 60 mm. The robot hand was fixed to the base of the experimental device, grasped the handle of a drill, and manipulated the trigger. Some fingers grasped the handle without an electric power supply, while the other fingers manipulated the trigger of the drill. The robot hand accomplished drilling the concrete plate for 15 min.
As a result, the robot hands, which have a small size and a high fingertip force, can grasp and manipulate a heavy object while saving energy. This means that the heat problem of the motors and electric circuits can be mitigated.
Fig. 7.63 Fingertip force without electric power. Red line is fingertip force. Blue line is PWM
signal of motor driver
7.4.3 Conclusions
The developed robot hands can grasp and manipulate objects of variable size and shape. The robot hand can generate a high output force, which can be maintained while saving electrical power. To our knowledge, this is the first robot hand with a size of 300 mm to achieve this. The robot hand has high potential not only for disaster response, but also for industrial use.
Recently, there is an increasing need for robots that can work in dangerous environments instead of humans. A master-slave robot that can be controlled intuitively by a human operator is desirable in such environments because of its adaptability. Many remote-controlled robots have been developed [39].
Many conventional master-slave systems adopt bilateral control with force feedback. In bilateral control systems, since contact information between the slave robot and its environment is fed back to the master system as sensed forces, the feeling experienced when operating the system is close to the feeling of manipulating the environment directly, making such systems suitable for precise work. However, the master device tends to be complicated, and it is often not easy for the operator to wear it. Moreover, because its motion range is limited, it tends to be difficult to perform various kinds of work, such as tasks that a human carries out using both arms.
On the other hand, there are unilateral master-slave systems in which the master device has only sensors for measuring the operator's motion. Because the master
On the other hand, there are unilateral master-slave systems in which a master
device has only sensors for measuring the operator’s motion. Because the master
Fig. 7.66 Overview of the master-slave system: the operator's master device (FST/FSG for hand, arm, and head; eye tracker; HMD) and the slave robot (four-limbed robot with high-power hand, stereo camera head, and 3D depth sensor) communicate over TCP/UDP; joint angles are sent to the slave and force display (VF, V-PAC, etc.) is returned to the operator
device does not have a complicated mechanism, its weight can be kept low and its size can be reduced. Therefore, it is easy for an operator to wear the master, and the operator's workload becomes small. Moreover, manipulability improves, and high-speed operation also becomes possible. Instead of direct force feedback, a tele-existence function that presents visual or tactile senses to the operator, as if the slave were the operator's own body, is important. Moreover, the operability can be improved by adding some type of semi-autonomous control to assist an operation.
Based on unilateral control with tele-existence and semi-autonomous assist control, we have developed a master-slave system that has a lightweight master system and a high-power slave robot, resulting in fewer restrictions on the operator's motions and improved operability [31, 37, 38]. This system has been applied to remote control of the WAREC-1, which is a four-limbed high-power robot for rescuing people in disaster environments. In this section, we explain the details of the developed system and its semi-autonomous assist control.
and the relative orientations of the body parts with respect to the torso are calculated
from the joint angles of the FSTs.
The joint angle information is collected at the FST controller, and the positions
and the orientations of the head, the wrists, and the knees are computed. The position
and orientation information is sent to a slave controller through an Ethernet cable or a Wi-Fi network.
In the slave robot, the posture of the operator is transformed to the posture of
the slave robot. The desired joint angles are computed and sent to the positional
controller of the slave robot.
On the other hand, visual information from the stereo vision system and tactile
information of the touch sensors of the slave robot are sent to the operator. The
operator gets the visual information via the head mounted display (HMD), and the
tactile information by force feedback devices on the fingertips.
A prototype of the FST was proposed by Osuka et al. [29], and subsequently, commercial products have been developed by Kyokko Electric, Co. Recently, it has been used as a master system in some remote control systems [9, 29]. One of the main advantages of the FST is that it is light and flexible, so an operator can move it gently and quickly. In addition, the movable range of the operator is wider than that of existing systems, and the FST system is easy to wear.
Figure 7.67 (left) shows the appearance of an FST system. Each FST consists of links of 50 mm in length connected by joints, where the axes of two adjoining joints are orthogonal to each other. Each joint has a rotation sensor (potentiometer), and a twist joint is provided for every three bending joints.
By integrating the information from the joint angle sensors, the shape of the FST
can be calculated. Because the speed of computation is high, dynamic changes of
the FST shape can be measured.
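The shape reconstruction by integrating joint angles can be illustrated with a minimal planar sketch; the real FST alternates orthogonal bending and twist axes in 3D, which this simplification deliberately omits:

```python
import math

def fst_tip_position(joint_angles, link_len=0.05):
    """Planar sketch of reconstructing a sensor-tube shape: walk along the
    50 mm links, accumulating each measured joint angle into the heading.
    Returns the tip position (x, y) in meters."""
    x = y = heading = 0.0
    for a in joint_angles:          # a: bending angle of one joint, in radians
        heading += a
        x += link_len * math.cos(heading)
        y += link_len * math.sin(heading)
    return x, y
```

Because each pass over the angle array is just a running sum, the shape can be recomputed every sensor cycle, which is why dynamic changes of the tube shape are easy to track.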
In our system, we used two FSTs for the head, two for the arms, and two for the
legs, and thus, the total number of FSTs is six. The FST for each arm is 1050 mm
in length, and the number of joints is 23. The FST for the head is 650 mm in length,
and the number of joints is 14.
In such a multi-link type sensor, there is a possibility that the joint errors accumulate and become large at the tip of the tube. However, in practice, the error is not so large, and in particular its repeat accuracy is good. This is because the equilibrium shape of the tube is almost always the same, due to the effects of gravity, when the end-effector position is the same.
Compared with wireless devices, the FST is robust against disturbances because
the FST is physically connected to the operator’s body. There is also an advantage
that the operator can intuitively feel an operation because of its weight.
The Flexible Sensor Glove (FSG) is a sensor that measures the shape of the operator's hand. A photograph of the FSG is shown in Fig. 7.67 (right). At the back of each finger, two or three wires are installed. If a joint of the operator's finger is bent, the length of the wire changes, and by measuring the wire length with a linear potentiometer, the value of the joint angle is estimated.
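The wire-length-to-angle conversion might be sketched as follows; the knuckle-radius model and its numeric value are illustrative assumptions, not the FSG's actual calibration:

```python
def finger_joint_angle(wire_len, rest_len, radius=0.008):
    """Hypothetical mapping from measured wire length to a joint angle:
    bending the joint by theta pulls roughly theta * r of wire over a
    knuckle of assumed radius r (8 mm here, an illustrative value).
    Lengths in meters; returns the angle in radians."""
    return (wire_len - rest_len) / radius
```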
On the fingertips of the thumb, the index finger, and the middle finger, force feedback devices called "gravity grabbers" [33] are installed. These devices give a sense of touch to the operator's fingers by tightening a belt in the device.
As the slave robot, we used the WAREC-1 explained in Sect. 7.1. It has the same structure in each of its four limbs, with a total of 29 degrees of freedom, and it can realize various locomotion styles such as bipedal walking, quadrupedal walking, and ladder ascending and descending. By using two limbs as arms, it is also possible to perform dexterous manipulation.
In our system, the master device FST measures the position and orientation of the operator's hand. These are sent to the slave robot and transformed into the desired joint angles of the slave robot by inverse kinematics. Because of the differences between the operator's arm and the slave arm, the inverse kinematics often cannot be solved exactly, and singular postures often occur. To avoid this problem, we treat the inverse kinematics as a nonlinear optimization problem and solve it using the Levenberg-Marquardt (LM) method [31].
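A toy version of this formulation is sketched below for a planar 2-link arm, using a damped Gauss-Newton step (the core of the LM method, here with a fixed damping factor); the limb kinematics of WAREC-1 are not reproduced:

```python
import math

def lm_ik_2link(target, l1=1.0, l2=1.0, lam=1e-3, iters=200):
    """Solve planar 2-link IK as nonlinear least squares: minimize the squared
    fingertip position error with damped Gauss-Newton steps.  The damping term
    lam keeps the normal equations solvable even near singular postures."""
    q = [0.3, 0.3]                                   # initial joint-angle guess

    def residual(q):
        x = l1 * math.cos(q[0]) + l2 * math.cos(q[0] + q[1])
        y = l1 * math.sin(q[0]) + l2 * math.sin(q[0] + q[1])
        return [x - target[0], y - target[1]]

    for _ in range(iters):
        r = residual(q)
        eps = 1e-6                                   # numerical Jacobian
        J = [[0.0, 0.0], [0.0, 0.0]]
        for j in range(2):
            qp = list(q)
            qp[j] += eps
            rp = residual(qp)
            for i in range(2):
                J[i][j] = (rp[i] - r[i]) / eps
        # Solve (J^T J + lam*I) dq = -J^T r in closed form (2x2 system)
        a = J[0][0] ** 2 + J[1][0] ** 2 + lam
        b = J[0][0] * J[0][1] + J[1][0] * J[1][1]
        d = J[0][1] ** 2 + J[1][1] ** 2 + lam
        g0 = -(J[0][0] * r[0] + J[1][0] * r[1])
        g1 = -(J[0][1] * r[0] + J[1][1] * r[1])
        det = a * d - b * b
        q = [q[0] + (d * g0 - b * g1) / det,
             q[1] + (a * g1 - b * g0) / det]
    return q
```

The damping is what distinguishes this from plain Gauss-Newton: near a singular posture J^T J becomes ill-conditioned, but adding lam*I keeps the step bounded, which matches the motivation given in the text.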
In conventional master-slave systems, dexterity is not sufficient and moving speeds are mostly slow because of temporal and spatial errors. The temporal errors include sensing delays, communication delays, and delays caused by kinetic differences between the master and slave. The spatial errors include kinematic differences between the master and slave, and initial calibration errors between them. These result in a low operating speed, making it difficult to use such master-slave systems for practical purposes.
To solve these problems, it is useful to integrate autonomous control in the slave robot with master-slave control. This is conventionally called "shared control" [42]. In this section, we call it "assist control" to emphasize that prediction of human intention is included. There have been many previous studies on assist control. In [42], they are classified into three categories: environment-oriented, operator-oriented, and task-oriented.
To realize a practical assist control system, we integrated several assist modes: Virtual Fixture (VF) and Vision-based Predictive Assist Control (V-PAC). The mode is changed according to the conditions, either manually or automatically.
In addition, an assist method called "Scale-Gain Adjustment" is used [19]. This is a method of changing the scale and gain of the relationship between master and slave movements, considering the limitation of control accuracy and physical fatigue. It is used to improve the operability of the master-slave system. The details of the method are explained in [19].
In the case of constrained motions, such as motion along a wall or rotation of a valve, the operator may find the work easier if some degrees of freedom in the workspace are constrained. In virtual fixtures (VFs) [47], a virtual constraint that restricts some of the directions in which the slave moves is employed to achieve precise motions.
This is a common method in remote control systems, and several virtual fixtures are also adopted for constrained motions in our system. When the slave robot opens a valve, the robot motion is constrained to the rotation axis of the valve, as shown in Fig. 7.68a. In another example, when the slave robot drills a hole in a wall, a virtual fixture is given as a plane constraint along the wall, as shown in Fig. 7.68b. To drill a hole accurately, the robot's motion is then constrained to the direction perpendicular to the wall. These constraints make it easy to perform accurate work.
These virtual fixtures are given manually or autonomously. If the accuracy of the 3D recognition is reliable, a virtual constraint can be given autonomously. The reliability is shown on a visual display, and the operator can judge whether or not to use the virtual constraint. If an autonomous virtual fixture is not reliable, it can be set manually by the operator.
388 K. Hashimoto et al.
Fig. 7.69 a Concept of V-PAC b Data flow of the proposed reaching assist. [37]
Processes (B)–(D) are repeated in each control cycle. The grasp target is also re-
observed by the vision system in each cycle, so that a correct grasp is achieved
even if the target moves.
The important point of the algorithm is the integration of visual feedback control.
The feedback is modulated on the basis of the master's motion and stops after a short
time lag when the master stops. It is therefore not completely autonomous control but
“semi-autonomous” control.
In process D, the grasp position is corrected toward the target position observed by the
vision system. At the same time, the slave robot's motion is modified so that the end
time of the slave's grasp coincides with that of the master's grasp. This means
that the slave's hand catches up with the master's hand during the reaching motion.
This cannot be achieved if the maximum speed of the slave's hand is too low or if the
communication delay is large; even so, operability is improved because the response
improves even when the slave robot is slow.
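A minimal sketch of this catch-up behavior, assuming a constant-speed profile and hypothetical variable names (not the controller described here):

```python
import math

def slave_step(slave_pos, target_pos, t_now, t_master_end, v_max, dt):
    """One control-cycle position command for the slave hand, scheduled so
    that it arrives at the vision-observed target when the master's grasp
    is predicted to end."""
    remaining = max(t_master_end - t_now, dt)
    dist = math.dist(slave_pos, target_pos)
    if dist == 0.0:
        return slave_pos
    # Speed needed to arrive on time, capped by the slave's capability;
    # if v_max is too low, the slave cannot catch up, matching the
    # limitation noted in the text.
    speed = min(dist / remaining, v_max)
    frac = speed * dt / dist
    return tuple(s + frac * (t - s) for s, t in zip(slave_pos, target_pos))

# 30 cm to go, 1 s until the master's grasp ends, 10 ms control cycle.
nxt = slave_step((0.0, 0.0, 0.0), (0.3, 0.0, 0.0),
                 t_now=0.0, t_master_end=1.0, v_max=1.0, dt=0.01)
```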
Fig. 7.70 Valve rotation: a preparing b reaching with free motion c switching to VF d rotation
along VF
7.5.3 Experiment
The first task was valve rotation. The robot used one limb as a manipulator, and the
other three limbs were used to maintain the posture of the body.
The experimental result is shown as a sequence of continuous photos in Fig. 7.70.
First, the robot prepared to operate a valve as shown in (a). Next, the robot inserted
its end-effector to a hole between spokes of the hand-wheel as shown in (b). In this
phase, the gain-adjustment strategy [19] was used. Then, the control mode of the
robot was switched from the free operation mode to the VF mode as shown in
(c), and the motion of the end-effector was constrained to a circular path. Finally, the
hand-wheel was rotated along the VF as shown in (d). As a result, smooth rotation
was achieved.
In this task, the high-power multi-fingered robot hand described in Sect. 7.4 was
mounted at the end of the limb. The hand has 12 degrees of freedom, with three
fingers and an opposing thumb, and can output a force of 125 N per fingertip.
Owing to a non-energized lock mechanism, the robot hand can
7 WAREC-1 – A Four-Limbed Robot with Advanced Locomotion … 391
Fig. 7.71 Drill manipulation: a preparing b switching to VF c switching on the trigger d drilling
along VF
keep grasping without being supplied with power. The experimental result is shown
as a sequence of continuous photos in Fig. 7.71.
First, the drill was grasped by the hand as shown in (a). Next, the mode was
switched to the VF, and the robot motion was constrained to the direction orthogonal
to the plane as shown in (b). Then, one finger was used to switch on the drill rotation
as shown in (c). Finally, the drill made a hole as shown in (d). By using the VF,
smooth and accurate drill operation was achieved.
Fig. 7.73 Sequential photos of reaching and lifting [37]: a preparing b reaching c grasping
d target completely grasped
7.5.4 Conclusion
We have proposed a lightweight master system for the WAREC-1 and an assist control
system for improving the maneuverability of master–slave systems. For the lightweight
master system, we proposed a Flexible Sensor Tube (FST) system, which is a multi-
link mechanism with joint angle sensors, and showed the effectiveness of the system
for a disaster response robot.
Next, we presented an assist control system for our master-slave system, in which
autonomous control assists remote operation. Our assist control system
consists of virtual fixtures (VFs) and Vision-based Predictive Assist Control (V-
PAC). In VF, constraints are given manually or autonomously according to changes
in the environment. V-PAC consists of visual object recognition for the slave robot,
prediction of the operator's motion using a particle filter, estimation of the operator's
intention, and assistance of the reaching motion. The effectiveness of the
proposed system was verified by experiments.
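The operator-motion prediction mentioned above can be illustrated with a generic bootstrap particle filter; this is a textbook 1-D sketch under assumed noise parameters, not the filter design used in V-PAC:

```python
import math
import random

def pf_step(particles, weights, observed, motion_noise=0.05, obs_noise=0.05):
    """One bootstrap particle-filter cycle over 1-D hand positions:
    predict with a random-walk motion model, reweight with a Gaussian
    likelihood of the observation, then resample."""
    particles = [p + random.gauss(0.0, motion_noise) for p in particles]
    weights = [w * math.exp(-(p - observed) ** 2 / (2 * obs_noise ** 2))
               for p, w in zip(particles, weights)]
    total = sum(weights) or 1.0
    weights = [w / total for w in weights]
    particles = random.choices(particles, weights=weights, k=len(particles))
    return particles, [1.0 / len(particles)] * len(particles)

random.seed(0)
ps, ws = [0.0] * 200, [1.0 / 200] * 200
for obs in [0.05, 0.10, 0.15]:      # operator's hand moving toward a target
    ps, ws = pf_step(ps, ws, obs)
estimate = sum(ps) / len(ps)        # predicted hand position
```

In practice, the filter state would also include velocity so that the reaching target can be extrapolated ahead of the hand, but the cycle of predict, weight, and resample is the same.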
7.6 Conclusion
This chapter introduces a novel four-limbed robot, WAREC-1, which has identically
structured limbs, with 7 DoFs in each limb and 28 DoFs in total. WAREC-1 has var-
ious locomotion styles, such as bipedal/quadrupedal walking, crawling, and ladder
climbing, to move through various types of environments, such as rough terrain with
rubble piles, narrow places, stairs, and vertical ladders. The main contributions of this
chapter are the following five topics: (1) development of the four-limbed robot
WAREC-1; (2) SLAM using a laser range sensor array; (3) a teleoperation system
using past image records to generate a third-person view; (4) a high-power, low-energy
hand; (5) a lightweight master system for telemanipulation and an assist control
system for improving its maneuverability.
Acknowledgements This study was conducted with the support of Research Institute for Science
and Engineering, Waseda University; Future Robotics Organization, Waseda University, and as a part
of the humanoid project at the Humanoid Robotics Institute, Waseda University. This research was
partially supported by SolidWorks Japan K. K; DYDEN Corporation; and KITO Corporation. This
work was supported by Impulsing Paradigm Change through Disruptive Technologies (ImPACT)
Tough Robotics Challenge program of Japan Science and Technology (JST) Agency.
References
1. Asama, H., Tadokoro, S., Setoya, H.: In: COCN (Council on Competitiveness-Nippon) Project
on Robot Technology Development and Management for Disaster Response. IEEE Region 10
Humanitarian Technology Conference 2013 (2013)
2. Boston Dynamics (2018). http://www.bostondynamics.com/robot_Atlas. Accessed 1 Mar 2018
3. Butterfass, J., Grebenstein, M., Liu, H.: DLR-Hand II: next generation of a dextrous robot hand.
In: Proceedings of 2001 IEEE International Conference on Robotics and Automation, vol. 1,
pp. 109–114 (2001)
4. Council on Competitiveness Nippon (COCN).: Establishment plan for a disaster response robot
center. The 2013 Report of Council on Competitiveness Nippon (COCN) (2013)
5. DARPA Robotics Challenge Finals (2015). https://web.archive.org/web/20160428005028/http://
www.darparoboticschallenge.org. Accessed 1 Mar 2018
6. Dellin, C.M., Strabala, K., Haynes, G.C., Stager, D., Srinivasa, S.S.: Guided manipulation
planning at the DARPA Robotics Challenge Trials. Experimental Robotics, pp. 149–163. Springer,
Cham (2016)
7. Dexterous Hand - Shadow Robot Company (2018). https://www.shadowrobot.com/products/
dexterous-hand/. Accessed 1 Aug 2018
8. Endo, T., Kawasaki, H., Mouri, T., Ishigure, Y., Shimomura, H., Matsumura, M., Koketsu, K.:
Five-fingered haptic interface robot: HIRO III. IEEE Trans. Haptics 4(1), 14–27 (2011)
9. Fernando, C. L., Furukawa, M., Kurogi, T., Kamuro, S., Minamizawa, K., Tachi, S.: Design
of TELESAR V for transferring bodily consciousness in telexistence. In: 2012 IEEE/RSJ
International Conference on Intelligent Robots and Systems, pp. 5112–5118 (2012)
10. Fogel, D.B.: Evolutionary Computation, IEEE Press (1995)
11. Fujii, S., Inoue, K., Takubo, T., Mae, Y., Arai, T.: Ladder climbing control for limb mecha-
nism robot ‘ASTERISK’. In: Proceedings of IEEE International Conference on Robotics and
Automation, pp. 3052–3057 (2008)
12. Fukuda, T., Hasegawa, Y., Doi, M., Asano, Y.: Multi-locomotion robot-energy-based motion
control for dexterous brachiation. In: Proceedings of IEEE International Conference of Robotics
and Biomimetics, pp. 4–9 (2005)
13. Hashimoto, K., Kondo, H., Lim, H.O., Takanishi, A.: Online walking pattern generation using
FFT for humanoid robots. Motion and Operation Planning of Robotic Systems. Background
and Practical Approaches, pp. 417–438. Springer International Publishing, Berlin (2015)
7 WAREC-1 – A Four-Limbed Robot with Advanced Locomotion … 395
14. Hashimoto, K., Koizumi, A., Matsuzawa, T., Sun, X., Hamamoto, S., Teramachi, T., Sakai, N.,
Kimura, S., Takanishi, A.: Development of disaster response robot for extreme environments: –
4th report: proposition of crawling motion for four-limbed robot–(in Japanese). In: Proceedings
of 2016 JSME Annual Conference on Robotics and Mechatronics (Robomec), pp. 1A2–09b7
(2016)
15. Hirose, S., Tsukagoshi, H., Yoneda, K.: Normalized energy stability margin and its contour
of walking vehicles on rough terrain. In: Proceedings of IEEE International Conference on
Robotics and Automation, pp. 181–186 (2001)
16. Iida, H., Hozumi, H., Nakayama, R.: Development of ladder climbing robot LCR-1. J. Robot.
Mechatron. 1, 311–316 (1989)
17. Jacoff, A., Messina, E.: Urban search and rescue robot performance standards progress update.
In: Proceedings of SPIE 6561, Unmanned Systems Technology IX, vol. 65611L, pp. 29–34.
(2007)
18. Jacoff, A., Downs, A., Virts, A., Messina, E.: Stepfield pallets: repeatable terrain for evaluating
robot mobility. In Proceedings of the 8th Workshop on Performance Metrics for Intelligent
Systems, pp. 29–34 (2008)
19. Kamezaki, M., Eto, T., Sato, R., Iwata, H.: A scale-gain adjustment method for master-slave
system considering complexity, continuity, and time limitation in disaster response work. In:
JSME Robotics and Mechatronics Conference, pp. 2A2–M02 (2018) (in Japanese)
20. Karumanchi, S., Edelberg, K., Baldwin, I., Nash, J., Reid, J., Bergh, C., Leichty, J., Carpenter,
K., Shekels, M., Gildner, M., Newill-Smith, D., Carlton, J., Koehler, J., Dobreva, T., Frost, M.,
Hebert, P., Borders, J., Ma, J., Douillard, B., Backes, P., Kennedy, B., Satzinger, B., Lau, C.,
Byl, K., Shankar, K., Burdick, J.: Team RoboSimian: semi-autonomous mobile manipulation
at the 2015 DARPA robotics challenge finals. J. Field Robot. 34(2), 305–332 (2016)
21. Kawasaki, H., Komatsu, T., Uchiyama, K.: Dexterous anthropomorphic robot hand with dis-
tributed tactile sensor: Gifu Hand II. IEEE/ASME Trans. Mechatron. 7(3), 296–303 (2002)
22. Kitai, S., Toda, Y., Takesue, N., Wada, K., Kubota, N.: Intelligent control of variable sokuiki
sensor array for environmental sensing (in Japanese). In: 2017 JSME Annual Conference on
Robotics and Mechatronics (Robomec), pp. 1P1–Q06 (2017)
23. Kitai, S., Toda, Y., Takesue, N., Wada, K., Kubota, N.: Intelligent control of variable ranging
sensor array using multi-objective behavior coordination. Intelligent Robotics and Applica-
tions, ICIRA 2018. Lecture Notes in Computer Science, vol. 10984. Springer, Berlin (2018)
24. Kitani, M., Asami, R., Sawai, Y., Sato, N., Fujiwara, T., Endo, T., Matsuno, F., Morita, Y.: Tele-
operation for legged robot by virtual marionette system (in Japanese). In: 2017 JSME Annual
Conference on Robotics and Mechatronics (Robomec), pp. 1P1–Q03 (2017)
25. Kojima, K., Karasawa, T., Kozuki, T., Kuroiwa, E., Yukizaki, S., Iwaishi, S., Ishikawa, T.,
Koyama, R., Noda, S., Sugai, F., Nozawa, S., Kakiuchi, Y., Okada, K., Inaba, M.: Development
of life-sized high-power humanoid robot JAXON for real-world use. In: Proceedings of IEEE-
RAS International Conference on Humanoid Robots, pp. 838–843 (2015)
26. Kondak, K., Huber, F., Schwarzbach, M., Laiacker, M., Sommer, D., Bejar, M., Ollero, A.:
Aerial manipulation robot composed of an autonomous helicopter and a 7 degrees of freedom
industrial manipulator. In: Proceedings of IEEE International Conference on Robotics and
Automation, pp. 2107–2112 (2014)
27. Lewis, M., Wang, J., Hughes, S., Liu, X.: Experiments with attitude: attitude displays for
teleoperation. In: 2003 IEEE International Conference on Systems, Man and Cybernetics, pp.
1345–1349 (2003)
28. Lim, J., Lee, I., Shim, I., Jung, H., Joe, H.M., Bae, H., Sim, O., Oh, J., Jung, T., Shin, S., Joo,
K., Kim, M., Lee, K., Bok, Y., Choi, D.G., Cho, B., Kim, S., Heo, J., Kim, I., Lee, J., Kwon,
I.S., Oh, J.H.: Robot system of DRC-HUBO+ and control strategy of team KAIST in DARPA
robotics challenge finals. J. Field Robot. 34(4), 802–829 (2017)
29. Maeda, K., Osuka, K.: Error analysis of FST for accuracy improvement. In: Proceedings of
SICE Annual Conference, pp. 1698–1700 (2010)
30. Marion, P., Fallon, M., Deits, R., Valenzuela, A., D’Arpino, C.P., Izatt, G., Manuelli, L., Antone,
M., Dai, H., Koolen, T., Carter, J., Kuindersma, S., Tedrake, R.: Director: a user interface
designed for robot operation with shared autonomy. J. Field Robot. 34(2), 262–280 (2017)
396 K. Hashimoto et al.
31. Matsumoto, Y., Namiki, A., Negishi, K.: Development of a safe and operable teleoperated
robot system controlled with a lightweight and high-operable master device. In: Proceedings
of IEEE/SICE International Symposium System Integration, pp. 552–557 (2015)
32. Matsuzawa, T., Koizumi, A., Hashimoto, K., Sun, X., Hamamoto, S., Teramachi, T., Kimura,
S., Sakai, N., Takanishi, A.: Crawling gait for four-limbed robot and simulation on uneven
terrain. In: Proceedings of the 16th IEEE-RAS International Conference on Humanoid Robots,
pp. 1270–1275 (2016)
33. Minamizawa, K., Fukamachi, S., Kajimoto, H., Kawakami, N., Tachi, S.: Gravity grabber:
wearable haptic display to present virtual mass sensation. In: ACM SIGGRAPH (2007)
34. Mouri, T., Kawasaki, H., Yoshikawa, K., Takai, J., Ito, S.: Anthropomorphic Robot Hand: Gifu
Hand III. In: Proceedings of the 2002 International Conference on Control, Automation and
Systems, pp. 1288–1293 (2002)
35. Mouri, T., Kawasaki, H.: Humanoid robots human-like machines. A Novel Anthropomorphic
Robot Hand and its Master Slave System, pp. 29–42. I-Tech Education and Publishing (2007)
36. Nakano, E., Nagasaka, S.: Leg-wheel robot: a futuristic mobile platform for forestry industry.
In: Proceedings of IEEE/Tsukuba International Workshop on Advanced Robotics, pp. 109–112
(1993)
37. Namiki, A., Matsumoto, Y., Liu, Y., Maruyama, T.: Vision-based predictive assist control on
master-slave systems. In: Proceedings of IEEE International Conference Robotics and Automa-
tion, pp. 5357–5362 (2017)
38. Negishi, K., Liu, Y., Maruyama, T., Matsumoto, Y., Namiki, A.: Operation assistance using
visual feedback with considering human intention on master-slave systems. In: Proceedings of
IEEE International Conference Robotics and Biomimetics, pp. 2169–2174 (2016)
39. Niemeyer, G., Preusche, C., Hirzinger, G.: Telerobotics. Springer Handbook of Robotics, pp.
741–757. Springer, Berlin (2008)
40. No Electricity Locking System — Adamant Namiki Precision Jewel Co., Ltd (2018). https://
www.ad-na.com/en/product/dccorelessmotor/dynalox.html. Accessed 1 Aug 2018
41. Ohmichi, T., Ibe, T.: Development of vehicle with legs and wheels. J. Robot. Soc. Jpn. 2(3),
244–251 (1984)
42. Passenberg, C., Peer, A., Buss, M.: A survey of environment-, operator-, and task-adapted
controllers for teleoperation systems. Mechatronics 20(7), 787–801 (2010)
43. Pounds, P., Bersak, D., Dollar, A.: Grasping from the air: hovering capture and load stability. In:
Proceedings of IEEE International Conference on Robotics and Automation, pp. 2491–2498
(2011)
44. Quigley, M., Conley, K., Gerkey, B.P., Faust, J., Foote, T., Leibs, J., Wheeler, R., Ng, A.Y.:
ROS: an open-source robot operating system. In: ICRA Workshop on Open Source Software,
vol. 3(3.2), p. 5 (2009)
45. Ramirez Rebollo, D.R., Ponce, P., Molina, A.: From 3 fingers to 5 fingers dexterous hands.
Adv. Robot. 31(19–20), 1051–1070 (2017)
46. Rechenberg, I.: Evolutionsstrategie: Optimierung Technischer Systeme nach Prinzipien der
Biologischen Evolution. Frommann-Holzboog Verlag, Stuttgart (1973)
47. Rosenberg, L.B.: Virtual fixtures: perceptual tools for telerobotic manipulation. In: Virtual
Reality Annual International Symposium, pp. 76–82. IEEE (1993)
48. Rebula, J.R., Neuhaus, P.D., Bonnlander, B.V., Johnson, M.J., Pratt, J.E.: A controller for the
LittleDog quadruped walking on rough terrain. In: 2007 IEEE International Conference on
Robotics and Automation (ICRA), pp. 1467–1473 (2007)
49. Sasaki, Y., Tanabe, R., Takemura, H.: Probabilistic 3D sound source mapping using moving
microphone array. In: 2016 IEEE/RSJ International Conference on Intelligent Robots and
Systems (IROS), pp. 1293–1298 (2016)
50. Schwarz, M., Rodehutskors, T., Droeschel, D., Beul, M., Schreiber, M., Araslanov, N., Ivanov,
I., Lenz, C., Razlaw, J., Schüller, S., Schwarz, D., Topalidou-Kyniazopoulou, A., Behnke, S.:
NimbRo Rescue: solving disaster-response tasks with the mobile manipulation robot Momaro.
J. Field Robot. 34(2), 400–425 (2016)
7 WAREC-1 – A Four-Limbed Robot with Advanced Locomotion … 397
51. Schwefel, H.-P.: Kybernetische Evolution als Strategie der experimentellen Forschung in der
Strömungstechnik. Diploma thesis, Technical University of Berlin (1965)
52. Shirasaka, S., Machida, T., Igarashi, H., Suzuki, S., Kakikura, M.: Leg selectable interface for
walking robots on irregular terrain. In: 2006 SICE-ICASE International Joint Conference, pp.
4780–4785 (2006)
53. Shiroma, N., Sato, N., Chiu, Y., Matsuno, F.: Study on effective camera images for mobile
robot teleoperation. In: 2004 IEEE International Workshop on Robot and Human Interactive
Communication (RO-MAN), pp. 107–112 (2004)
54. Start Production Faster - Robotiq (2018). https://robotiq.com/. Accessed 1 Aug 2018
55. Stentz, A., Herman, H., Kelly, A., Meyhofer, E., Haynes, G.C., Stager, D., Zajac, B., Bagnell,
J.A., Brindza, J., Dellin, C., George, M., Gonzalez-Mora, J., Hyde, S., Jones, M., Laverne, M.,
Likhachev, M., Lister, L., Powers, M., Ramos, O., Ray, J., Rice, D., Scheifflee, J., Sidki, R.,
Srinivasa, S., Strabala, K., Tardif, J.P., Valois, J.S., Weghe, J.M.V., Wagner, M., Wellington, C.:
CHIMP, the CMU highly intelligent mobile platform. J. Field Robot. 32(2), 209–228 (2015)
56. Sugimoto, M., Kagotani, G., Nii, H., Shiroma, N., Inami, M., Matsuno, F.: Time follower
vision: a teleoperation interface with past images. IEEE Comput. Graph. Appl. 25(1), 54–63
(2005)
57. Sun, X., Hashimoto, K., Hamamoto, S., Koizumi, A., Matsuzawa, T., Teramachi, T., Takanishi,
A.: Trajectory generation for ladder climbing motion with separated path and time planning.
In: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems, pp.
5782–5788 (2016)
58. Thrun, S., Burgard, W., Fox, D.: Probabilistic Robotics. MIT Press, Cambridge (2005)
59. Tsagarakis, N.G., Caldwell, D.G., Negrello, F., Choi, W., Baccelliere, L., Loc, V., Noorden, J.,
Muratore, L., Margan, A., Cardellino, A., Natale, L., Mingo Homan, E., Dallali, H., Kashiri,
N., Malzahn, J., Lee, J., Kryczka, P., Kanoulas, D., Garabini, M., Catalano, M., Ferrati, M., Var-
ricchio, V., Pallottino, L., Pavan, C., Bicchi, A., Settimi, A., Rocchi, A., Ajoudani, A.: WALK-
MAN: a high-performance humanoid platform for realistic environments. J. Field Robot. 34(7),
1225–1259 (2017)
60. Vaillant, J., Kheddar, A., Audren, H., Keith, F., Brossette, S., Escande, A., Kaneko, K., Mori-
sawa, M., Gergondet, P., Yoshida, E., Kajita, S., Kanehiro, F.: Multi-contact vertical ladder
climbing with a HRP-2 humanoid. Auton. Robot. 40(3), 561–580 (2016)
61. Yamauchi, B.: PackBot: a versatile platform for military robotics. In: Defense and Security,
International Society for Optics and Photonics, pp. 228–237 (2004)
62. Yoneda, H., Sekiyama, K., Hasegawa, Y., Fukuda, T.: Vertical ladder climbing motion with
posture control for multi-locomotion robot. In: 2008 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS), pp. 3579–3684 (2008)
63. Yoshida, T., Nagatani, K., Tadokoro, S., Nishimura, T., Koyanagi, E.: Improvements to the
rescue robot quince toward future indoor surveillance missions in the fukushima daiichi nuclear
power plant. Field and Service Robotics, vol. 92, pp. 19–32. Springer, Berlin (2013)
64. Yoshiike, T., Kuroda, M., Ujino, R., Kaneko, H., Higuchi, H., Iwasaki, S., Kanemoto, Y., Asa-
tani, M., Koshiishi, T.: Development of experimental legged robot for inspection and disaster
response in plants. In: Proceedings of IEEE/RSJ International Conference on Intelligent Robots
and Systems, pp. 4869–4876 (2017)
65. Yoshikawa, T.: Analysis and control of robot manipulators with redundancy. In: Proceedings
of Robotics Research: The First International Symposium, pp. 735–747 (1984)
Part IV
Component Technologies
Chapter 8
New Hydraulic Components for Tough
Robots
Most modern robots are driven by electromagnetic motors, but the world's first indus-
trial robots, which appeared in 1961, were hydraulically driven. This is mainly because,
at that time, electromagnetic motors were weak; for the next approximately
20 years, not only industrial robots but almost all robots were driven by hydraulic
actuators. However, the situation changed in the 1980s. When the development of
electromagnetic motors, such as rare-earth motors and brushless motors, which
advanced to the practical stage in the 1980s, resulted in electromagnetic actuators with
high power density, hydraulic actuators lost their position as the leading robot-use
actuators. Convenient and easy-to-use electromagnetic motors came to be favored
as the solution to problems related to hydraulic actuators, such as the difficulty of
sensitive force control or precision positioning, operating oil leakage, troublesome
oil-quality control, the arrangement of hydraulic hoses, the high cost of valves, and,
for the later mobile robots in particular, hydraulic cost. Ultimately, hydraulic robots
disappeared, except for deep-sea exploration robots and others with special purposes.
In recent years, however, the potential of hydraulic actuators as robot-use actuators
has been reconsidered [32]. One typical example is the series of robots with hydraulic
legs, BigDog, WildCat, and Atlas, developed by the American company Boston Dy-
namics [28]. Many videos of these robots went viral on the Internet, and as a result,
numerous people came to realize the new potential of hydraulic robots owing to
properties such as dynamism, power, good shock resistance and controllability, and
Fig. 8.1 Cutter Robot (left) and Jack-up Robot (right) [33]
404 K. Suzumori et al.
for this purpose, which too is driven hydraulically at 71 MPa. Figure 8.3 shows an
example of a shape-adaptable-type robot driven manually by the hydraulic McKibben
Artificial Muscle [20]. Thus, since the beginning of the twenty-first century, various
hydraulic robots have been developed, and efforts have been made to realize hydraulic
robots with characteristics impossible to achieve using electromagnetic actuators.
Hydraulic actuators are themselves compact, but they need a heavy pump to create
the high-pressure oil required to drive them, as well as hydraulic hoses to connect them
to the pump, imposing major restrictions on their use. BigDog, however, is equipped
with an engine-driven hydraulic pump and operates independently; it is a good
demonstration that, with appropriate design, a self-contained hydraulic mechanical
system can be made compact. Table 8.1 concretely summarizes the potential and
challenges of hydraulic actuators.
The major benefit of a hydraulic actuator is “(1) High power density (force pro-
duced per unit volume of the actuator).” However, regarding “(2) High output den-
sity (=work rate per unit volume),” the way this is considered differs depending on
whether (i) only the device that moves the robot (cylinder for example) is treated
as the actuator or (ii) the actuator is assumed to include the power pack (hydraulic
8 New Hydraulic Components for Tough Robots 405
Fig. 8.3 Shape-adaptable robot manually driven by hydraulic artificial muscle [20]
pump and motor that drives the pump). It is probably realistic to adopt view (i) for a
stationary robot arm, treating the power pack as a kind of external energy source, and
view (ii) for a self-contained mobile robot with an internal power pack. In other words,
hydraulic actuators can be counted on to achieve stationary robot arms with extremely
high (1) power density and (2) output density, but for a self-contained mobile robot,
“(2) high output density” cannot necessarily be counted on.
“(2) high output density.” “(3) Shock resistance” is another characteristic attractive
to robot designers. A video of the WildCat from Boston Dynamics mentioned above
includes scenes of the robot dynamically running and tumbling, where WildCat con-
tinues to run regardless of the violent impact on its legs. With electric motor drives, it
has, so far, been difficult to allow violent motion out of fear that their reduction gears
would be damaged. Regarding “(4) Efficient energy management,” let us consider a
four-legged robot demanding a maximum output of 100 W from each leg as an example.
If it used electric motors, a 100 W motor would be installed on each leg, meaning
that motors with a total output of 400 W would be installed on the robot.
However, in reality, all legs rarely generate the maximum output simultaneously, so
mounting motors with an output of 400 W can be described as an over-specification
design. In contrast, hydraulic actuators make it easy to allot hydraulic energy to dif-
ferent legs through time division by using switching valves. Hence, for example, in
the robot, it would only be necessary to install an approximately 200 W motor to
drive the hydraulic pump. “(5) High stiffness (lock) state and relaxed (free) state” can
be achieved easily by using switching valves; other major merits include easy
shock absorption and backdrivability. If both the control valves that link port A and
port B of a hydraulic actuator are in the “closed” state, the hydraulic actuator manifests
extremely high stiffness. Conversely, if both ports are in the “open” state and the pressure
is balanced, it manifests high backdrivability. On the other hand, applying hydraulic
actuators to robots leads to challenges (6)–(9) listed in Table 8.1. A highly efficient
and lightweight “(7) hydraulic power source” is a key device for realizing a mobile
robot. In a hydraulic system, an electromagnetic motor constantly drives a hydraulic
pump, so the energy conversion chain is “electric energy” → “hydraulic fluid energy” →
“operating energy”, twice as many conversion steps as direct drive by an electromagnetic
motor. Therefore, the overall energy efficiency is unavoidably lower than that of a robot
with an electromagnetic motor drive.
“(8) Multi-degree of freedom” and “(9) designed for robot use” differ significantly
between conventional industrial-use hydraulic machinery and robot-use hydraulic
machines. Compared with a robot, a conventional industrial-use hydraulic machine
is often either self-contained or a large, heavy machine with few degrees of freedom.
For example, a normal four-legged robot requires a minimum of 3 degrees of freedom
for each leg, so the entire robot needs joints with a minimum of 12 degrees of freedom,
and in many cases the actuators themselves are installed on the movable parts. It is
difficult to apply a conventional hydraulic actuator, premised on few degrees of freedom
and stationary use, to a robot without modification; smaller and lighter actuators,
cheaper control valves, more easily arranged and more compact hydraulic hoses and
connectors, and rotating/oscillating-type actuators are required. In addition, the current
hydraulic actuators used in industry are large and heavy because they are used to drive
stationary machines with few degrees of freedom. The minimum internal diameter of
hydraulic cylinders stipulated by JIS, for example, is 32 mm, but for robot use, cylinders
with internal diameters of 15, 20, and 30 mm are often demanded. As stated above,
hydraulic robots offer great potential with characteristics unachievable by electric
motor robots; however, there are limits to building a hydraulic robot from hydraulic
components already in use in industry, and developing hydraulic components suitable
for robots is the key to realizing the true strong points of hydraulic robots.
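The sizing arithmetic behind merit (4) above can be made concrete with a small calculation; the duty-cycle figure is an illustrative assumption matching the text's example:

```python
def pump_motor_rating(peak_w_per_leg, n_legs, duty_cycle):
    """Compare installed electric-motor power (peak capability on every
    leg) with the pump-drive motor power needed when a single hydraulic
    pump is time-shared across the legs via switching valves."""
    electric_total = peak_w_per_leg * n_legs
    hydraulic_total = peak_w_per_leg * n_legs * duty_cycle
    return electric_total, hydraulic_total

# Four legs, 100 W peak each; assume on average only half the legs are
# near peak load at any instant (duty cycle 0.5), as in the text's example.
electric, hydraulic = pump_motor_rating(100.0, 4, duty_cycle=0.5)
# electric = 400.0 W of installed motors vs. a 200.0 W pump-drive motor
```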
Considering the above circumstances, for ImPACT's Tough Robotics Challenge, the
authors supervised the research and development of a hydraulic actuator that would
result in “tough robots.” This study demanded both “novelty” in terms of hydraulic
actuator research and “practicality” in terms of the ability to mount the actuators on
tough robots. It was not easy to achieve both in a period of 5 years, but led by the
Tokyo Institute of Technology, a research organization bringing together “industry
and academia” and “hydraulics and robots” was formed and shouldered both chal-
lenges. Three principles were established to guide this research and development.
The first was that the research and development would be conducted to meet the needs
of the industrial world including robots. The second was the aim to conduct the most
effective research and development considering the present state of hydraulics tech-
nology. To comply with the above two principles, as the research advanced, regular
meetings were held by specialists in robot engineering, researchers in charge of the
robot platform, specialists in hydraulic engineering, and people in charge of research
on hydraulic actuators (The Hydraulic Robot Research Committee). Third, a prac-
tical “robot-use tough hydraulic actuator” would be created in cooperation with the
hydraulic industry to take advantage of its technologies. While prioritizing the nov-
elty and originality of the research, the detailed design and trial fabrication of the
actual device were conducted in cooperation with specialist hydraulics manufactur-
ers with the aim of realizing a new hydraulic actuator with practical value. To realize
the second and third principles, the research and development was conducted by
an industry-linked research organization staffed by highly experienced researchers.
Table 8.2 summarizes the major organizations participating in the research and the
challenge studied by each.
We established close relationships with Professor Yoshinada (Osaka University)
and with Professor Takanishi and Lecturer Hashimoto (Waseda University), who
oversaw the construction robots (Chapter ** of this book) and legged robots (Chapter **
of this book) that are the objects of application, respectively. In addition, the
Hydraulic Robot Research Committee, whose members included construction ma-
chinery makers, robot makers, and hydraulic machinery experts, strived to provide
varying perspectives. Figure 8.4 shows the system linking the project to the indus-
trial world that develops hydraulic components. By revising the conventional view of
hydraulic actuators and developing actuators specifically for robots from the
perspective of robot engineering, the project participants conducted research and
development aiming to achieve tough robots with power and durability far superior
to those of electric motor robots, and with working performance and dexterity far
superior to those of conventional construction machinery.
To achieve tough robots that can operate at disaster sites and in other extreme
environments, this research focused on the high output density, large generative force,
shock resistance, and environmental resistance of hydraulic actuators, and studied size
reduction, increased intelligence, weight reduction, multiple degrees of freedom, and
reduced sliding-part friction on the basis of past hydraulics technologies. Specifically,
as will be shown in detail in the following sections, research and development of com-
pact, lightweight, and high-output actuators; rotating high-torque motors; low-sliding
cylinders/motors; power packs; high-output McKibben artificial muscles; particle
excitation-type control valves; hybrid boosters; and hydraulic control systems was
undertaken, along with research on their application to tough robots. Here, a number
of these are summarized.
Fig. 8.4 Industry-academia cooperation system for the development of robot use hydraulic com-
ponents
8 New Hydraulic Components for Tough Robots 409
Although this project did not extend to the fabrication of an actual prototype, the
characteristics of the developed actuator, including compactness, light weight, and
high output, make a narrow-diameter, long robot arm feasible. Figure 8.6 shows an
example of such an arm, intended for reliable use at disaster sites, for example, to
enter narrow spaces for search and rescue work.
Fig. 8.6 Image of the target tough robot arm using a “lightweight and high-output” hydraulic
actuator (arm length: 10 m, diameter: 0.3 m, tip payload: 100 kgf)
(Figure component labels: rod wiper, cylinder tube, rod packing, tie rod, piston
packing, cushion valve with integrated check valve, keep plate; oscillating angle,
stroke end, internal stopper, vane shaft)
Conventional hydraulic actuators need operating pressures of 0.3–0.8 MPa even under
no load because of the high sliding resistance owing to the packing that prevents
leakage of the working fluid.
For robotic usage, we have developed lightweight hydraulic actuators made of
lightweight alloys whose rated pressure is 35 MPa. Sliding resistance was reduced
by appropriate selection of structures and materials for the seals, minimizing the
surface roughness of the sliding parts, and strict control of dimensional tolerances
and machining temperature. The developed actuators achieved no-load drive at an
operating pressure of 0.15 MPa with almost no leakage of the working fluid.
As shown in Fig. 8.9, the developed cylinder successfully achieved twice the output
density of cylinders regulated by Japanese Industrial Standards (JIS) and the
International Organization for Standardization (ISO). Note that the output density
here is the ratio of the maximum thrust force to the mass of the cylinder.
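As a rough numerical illustration of this metric, the maximum thrust equals the supply pressure times the piston area. The sketch below assumes a hypothetical 20 mm bore and 0.5 kg cylinder mass (illustrative values, not those of the developed cylinders):

```python
import math

def cylinder_output_density(pressure_pa, bore_m, mass_kg):
    """Output density = maximum thrust / cylinder mass [N/kg]."""
    piston_area = math.pi / 4 * bore_m ** 2   # cap-side piston area [m^2]
    max_thrust = pressure_pa * piston_area    # maximum thrust [N]
    return max_thrust / mass_kg

# Hypothetical example: 20 mm bore at 35 MPa with a 0.5 kg cylinder
density = cylinder_output_density(35e6, 0.020, 0.5)
print(f"output density: {density / 1000:.1f} kN/kg")
```
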
Based on the discussion above, we developed cylinders for hydraulic robots operated
at 35 MPa with bore diameters of 20–60 mm. Figure 8.10a, b show examples of the
developed cylinders. These cylinders are applied to a tough robotic hand (the details
of which are given in Sect. 8.6).
Figure 8.11a shows the result of measuring the sliding pressure of the developed
cylinder (Fig. 8.10a). In the measurement, the cylinder was driven with no load over
its full stroke of 100 mm.
Figure 8.11a, b show the time responses of the operating pressure and the piston
position on the pushing side and the pulling side, respectively.
Although the pressure increases near the stroke end, the operating pressure
achieved is <0.01 MPa.
Oscillating torque actuators that can be operated at 35 MPa were also developed,
and their output density reached higher values, as shown in Fig. 8.12. The developed
motors were designed for no-load drive at less than 0.2 MPa. The vane packing
was molded from a special resin (Fig. 8.13). The developed oscillating actuator,
depicted in Fig. 8.14, has one vane, a rotating angle of 270°, and an output
torque of 160–670 Nm. A 360° oscillating motor was also developed for the demanding
requirements of robotic usage (Fig. 8.15). This oscillating motor can generate 600 Nm
of torque at an applied pressure of 35 MPa. Using this type of hydraulic motor
enables a design with a smaller outer diameter, which is advantageous for a
compact hydraulic robot design.
414 K. Suzumori et al.
(a) MDF magnesium alloy type [25]
Hydraulic systems require pumps as high-pressure sources, driven by electric
motors or engines, to supply hydraulic power to hydraulic actuators. Therefore,
hydraulic robots need hydraulic power packs that integrate the driving equipment,
including hydraulic pumps, valves, and motors or engines. The developed power
pack is shown in Fig. 8.16. The power pack was designed to be applied to the
legged robot developed by Hashimoto et al. [2].
In general, the locomotive range of a robot is limited when electric motors are
used as actuators and the electric energy for driving is supplied by an external
energy source, which is inconvenient in many situations that require robots. For this
Fig. 8.11 Time responses of the operating pressure and the piston position of the
developed cylinder: (a) push direction, (b) pull direction
Figure 8.17 shows the hydraulic circuit diagram of the developed power pack. As
mentioned above, an electric motor or an engine is used to drive the pump. The
power pack starts with no load, and a relief valve is set to the maximum supply
pressure as a load after the rotating condition has stabilized. The relief valve also
works as a safety valve. The working fluid output from the pump is supplied to loads,
including actuators, via a check valve and a filter. When the output pressure is higher
than the precharge pressure of the accumulator, surplus flow is charged to the
accumulator, and the charged flow increases as the output pressure of the pump
increases. The pump can maintain the desired flow rate by controlling the output flow
rate with a pressure-compensating mechanism. When the output pressure decreases
because of an increase in the flow rate, the pump increases the output flow rate, and
the accumulator supports this action. This function of the accumulator enables
actuation at flow rates higher than the maximum flow rate of the pump.
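The flow-sharing behavior described above can be sketched as a steady-state balance: the pump covers demand up to its maximum flow, the accumulator discharges to cover any surplus demand, and surplus pump capacity recharges the accumulator. The pump rating below is a hypothetical value for illustration:

```python
def supply_split(demand_lpm, pump_max_lpm):
    """Split a demanded flow [L/min] between the pump and the accumulator."""
    pump_flow = min(demand_lpm, pump_max_lpm)               # pump delivers up to its max
    accumulator_flow = max(demand_lpm - pump_max_lpm, 0.0)  # accumulator discharge
    recharge_flow = max(pump_max_lpm - demand_lpm, 0.0)     # surplus recharges it
    return pump_flow, accumulator_flow, recharge_flow

# Hypothetical 8 L/min pump: below, at, and above its maximum flow
for demand in (3.0, 8.0, 12.0):
    print(demand, supply_split(demand, 8.0))
```
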
A hydraulic pump is one of the key components in the power pack. For several
years now, Takako Industries, Inc., has been producing swash plate piston pumps
that can output a maximum pressure of 21 MPa. We designed our power pack based
on a pump whose piston displacement is 1.6 mL/rev to satisfy the flow rate and pressure
required to realize functional motion of the legged robot. The maximum rotating
speed of 3000 rpm and the maximum operating pressure of 21 MPa were improved
to 5000 rpm and 35 MPa, respectively. These improvements result in roughly a tripling
of the performance of the original pump. The prototype can also change the output flow
rate with a pressure-compensation mechanism to improve efficiency. The control valves
are small and of modular design (NG3-Mini series, Wandfluh, Switzerland), and the
filter is also modular (MVF-01-3C-I, Taisei Kogyo Co., Ltd., Japan).
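The quoted improvement can be checked arithmetically: hydraulic power scales with rotating speed (and hence flow rate) times pressure, so the two upgrades compound:

```python
speed_gain = 5000 / 3000     # 3000 rpm -> 5000 rpm
pressure_gain = 35 / 21      # 21 MPa -> 35 MPa
power_gain = speed_gain * pressure_gain
print(f"overall power improvement: {power_gain:.2f}x")  # about 2.8x, i.e., roughly a tripling
```
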
Two types of electrically driven power packs were developed: one uses an electric
motor actuated by an alternating voltage of 200 Vrms supplied from outside the
robot, and the other has a DC electric motor driven by a battery on the power pack.
Fig. 8.16 Photograph of the developed power pack (labeled components: engine, oil
tank, pump, and accumulator)
A four-cycle gasoline engine was adopted to drive the power pack because of its
low exhaust emissions, vibration, and noise.
Table 8.3 summarizes the features of each power pack type.
Because the three types of power packs each have unique features, designers
should appropriately select the type of power packs according to conditions such as
operating environment, working objective, and driving duration.
Hydraulic circuits are not as convenient to assemble as electric circuits. In
particular, several problems must be solved before hydraulics can be widely used
in robots. These problems include oil leakage at connections of pipes and/or hoses,
and difficulty in designing hose routing owing to the limitation of the minimum
bending radius.
In addition, sealing tape must be used when hydraulic devices are connected by
tapered screws (as is widely done in hydraulic devices); this sometimes contaminates
the working fluid and leads to failures. This research basically uses straight screws
to avoid this problem and to simplify the piping work.
Fig. 8.19 Small-size swivel joints adaptable to 35 MPa
Because robotic usage requires high pressure but low flow rates compared to
conventional hydraulic devices, a pipeline diameter of about 1/8 in. is generally
used. Despite this demand, it is difficult to obtain fittings and hoses of 1/8 in. size
that can withstand continuous operation at 35 MPa. Figure 8.18 shows the developed
G1/8-type hose that can be used at 35 MPa. The developed swivel joints adaptable
to 35 MPa are shown in Fig. 8.19. (The left and right photographs in Fig. 8.19 show
the three-axis and multi-axis types, respectively.) Appropriate usage of these
components offers great potential for composing small, high-pressure hydraulic
systems.
Several analytical methods have been proposed for the McKibben artificial muscle.
In this study, we consider the contraction force and the contraction rate using
Schulte's method [30]. The contraction force F can be written as:

F = \frac{\pi D_0^2 P}{4 \sin^2\theta_0}\left[3(1-\varepsilon)^2\cos^2\theta_0 - 1\right] \qquad (8.1)
where P is the applied pressure, ε is the contraction ratio, D_0 is the initial diameter
of the tube, and θ_0 is the initial braid angle. Equation 8.1 indicates that a high
pressure must be applied in order to obtain a high contraction force. Conventional
pneumatic McKibben artificial muscles are driven by air pressure at approximately
0.5 MPa. There is a danger of explosion when high pressures are applied because
air is a compressible fluid. Therefore, we utilized a hydraulic system in order to
achieve the research objective.
In this study, the goal was to achieve a contraction force in excess of 5 kN by
applying pressure in excess of 5 MPa. These target values are ten times higher than
those of a conventional system.
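Equation 8.1 can be evaluated directly to see how the contraction force scales with pressure. The geometry below (10 mm initial diameter, 20° initial braid angle) is hypothetical, chosen only to illustrate that multi-kilonewton forces follow from megapascal pressures:

```python
import math

def mckibben_force(p_pa, eps, d0_m, theta0_rad):
    """Contraction force of a McKibben muscle, Schulte's method (Eq. 8.1)."""
    prefactor = math.pi * d0_m ** 2 * p_pa / (4 * math.sin(theta0_rad) ** 2)
    return prefactor * (3 * (1 - eps) ** 2 * math.cos(theta0_rad) ** 2 - 1)

# Hypothetical geometry: D0 = 10 mm, theta0 = 20 deg, zero contraction
d0, theta0 = 0.010, math.radians(20)
for p_mpa in (1, 3, 5, 7):
    force = mckibben_force(p_mpa * 1e6, 0.0, d0, theta0)
    print(f"{p_mpa} MPa -> {force / 1000:.1f} kN")
```
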
Fig. 8.21 Prototype of hydraulic artificial muscle. It has a length of 280 mm and an outer diameter
of 15 mm
Table 8.4 Experimental results for maximum contraction force and maximum contraction rate
Applied pressure:    1 MPa  3 MPa  5 MPa  7 MPa
Maximum force (kN):  1.9    5.2    8.8    12.0
Maximum rate (%):    23     28     30     31
The pressure limit of the McKibben artificial muscle depends mainly on the strength
of the cords. Therefore, high-strength cords made of aramid fibers were applied [21].
With this prototype, a contraction force of more than 5 kN is expected at a pressure
of 5 MPa, according to Eq. 8.1.
An image of the prototype of the hydraulic artificial muscle is shown in Fig. 8.21.
The contraction force of the hydraulic artificial muscle was measured using a load
cell, and the length was simultaneously measured using a laser displacement meter.
Pressure values of 1, 3, 5 and 7 MPa were applied. The results are represented in
Fig. 8.22. In this figure, the values calculated using Eq. 8.1 are also plotted. Although
there are deviations between the experimental results and the calculated values which
result from friction between the rubber tube and the cords, we confirmed that the
deviations are not significant. Table 8.4 shows the maximum force and the rate of
contraction for the respective applied pressure values. It can be seen that the target
value of more than 5 kN is achieved at a pressure of 5 MPa.
One of the differences between the McKibben artificial muscle and conventional
actuators is that the contraction force of the former depends on its contraction rate,
as observed in Fig. 8.22. Therefore, we adopted the EDM index [17] in this study
in order to evaluate the energy efficiency of the actuator. EDM takes into account
the variation of the contraction force with the contraction rate of an artificial
muscle, and is given by the actuator energy divided by its mass. The EDM of a
McKibben artificial muscle, EDM_AM, can be written as:
\mathrm{EDM}_{\mathrm{AM}} = \frac{1}{M}\int F\,dx \qquad (8.2)

= \frac{\pi D_0^2 L_0 P}{4 M \sin^2\theta_0}\left[\cos^2\theta_0 - (1-\varepsilon_M)^3\cos^2\theta_0 - \varepsilon_M\right] \qquad (8.3)
where L_0 is the length, M is the mass, and ε_M is the maximum contraction rate of
the McKibben artificial muscle. The EDM of a conventional hydraulic cylinder,
EDM_c, can be written as:

\mathrm{EDM}_{\mathrm{c}} = \frac{fs}{m} \qquad (8.4)
where f is the force, s is the stroke, and m is the mass of the cylinder. Table 8.5
presents a comparison of EDM between the prototype hydraulic muscle and a
hydraulic cylinder. Here, as an example of a hydraulic cylinder, PH-1ST15x168-T
[14] is compared with the hydraulic artificial muscle. The applied pressures are 21 MPa
for the hydraulic cylinder and 5 MPa for the hydraulic artificial muscle, respectively.
Fig. 8.23 Experiments on resistance to external shocks and vibrations. The length and outer
diameter of the artificial muscle are 500 mm and 13.1 mm, respectively
The length is 280 mm for both. It is clear that the EDM index of the hydraulic
artificial muscle is much higher than that of the hydraulic cylinder. This result
shows that the hydraulic artificial muscle is more efficient than the hydraulic cylinder.
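Equations 8.3 and 8.4 can be compared numerically. The values below are hypothetical (they are not the prototype's or the PH-1ST15x168-T's actual data) and serve only to show how the two indices are computed:

```python
import math

def edm_muscle(p_pa, d0_m, l0_m, mass_kg, theta0_rad, eps_max):
    """EDM of a McKibben artificial muscle (Eq. 8.3) [J/kg]."""
    c2 = math.cos(theta0_rad) ** 2
    bracket = c2 - (1 - eps_max) ** 3 * c2 - eps_max
    return (math.pi * d0_m ** 2 * l0_m * p_pa
            / (4 * mass_kg * math.sin(theta0_rad) ** 2)) * bracket

def edm_cylinder(force_n, stroke_m, mass_kg):
    """EDM of a hydraulic cylinder (Eq. 8.4): f*s/m [J/kg]."""
    return force_n * stroke_m / mass_kg

# Hypothetical parameters for illustration only
muscle = edm_muscle(5e6, 0.010, 0.280, 0.10, math.radians(20), 0.30)
cylinder = edm_cylinder(3700.0, 0.050, 1.0)
print(f"EDM muscle:   {muscle:.0f} J/kg")
print(f"EDM cylinder: {cylinder:.0f} J/kg")
```
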
In the following sections, we demonstrate the unique properties of the artificial
muscle, which is a flexible actuator made of flexible rubber and braided cords.
The developed artificial muscle consists of a rubber tube surrounded by braided cords,
so it is highly resistant to strong external shocks and vibrations. This inherent design
property is expected to lead to tough robots that can handle work in which shocks are
applied, which is difficult for existing electric-motor-driven robots, e.g.,
creating holes in walls using an impact drill, chipping concrete walls, etc.
Figure 8.23 shows an example of an experimental device used to apply strong external
shocks and vibrations to an artificial muscle. We applied an antagonistic drive system
to verify the shock resistance by concrete chipping [22].
Fig. 8.25 3 DoF movement of the wrist mechanism by 0.5 MPa air pressure
Hydraulic artificial muscles are flexible and light actuators for robots that can
generate very high forces owing to the hydraulic pressure of the oil. In addition,
these muscles can be placed and twisted freely. In this section, we propose a compact,
flexible, and light 3 DoF wrist mechanism that consists of hydraulic artificial muscles.
Figure 8.24 shows a 3D CAD model of the 3 DoF wrist mechanism. The Pitch
axis and the Yaw axis in Fig. 8.24 are universal joints, and the system can be operated
with 2 DoF using four hydraulic artificial muscles. In addition, the Roll
axis, which rotates the entire wrist, is operated by two hydraulic artificial muscles. This
Roll axis is the so-called twist rotation of the wrist, and the operation is realized by
arranging the artificial muscle to wind around the structure by utilizing the flexibility
of the hydraulic artificial muscle. In this example, an air pressure of 0.5 MPa was
applied for operation using all six hydraulic artificial muscles. The motion is shown
in Fig. 8.25. Through this study, we confirmed that a smooth three-axis motion can
be realized using flexible actuators. It is expected that this arrangement will realize
unique robotic movements [23, 24].
8.3.6 Conclusion
The developed artificial muscle was able to operate at a pressure level of 5 MPa,
which was much higher compared to a conventional McKibben type artificial muscle.
Therefore, it was possible to generate a significantly higher amount of power.
In this study, we developed an innovative, lightweight, and highly powerful artifi-
cial muscle that has excellent pressure and oil resistance, and is capable of converting
high hydraulic pressure into efficient power generation. It is an innovative actuator
with a strength-to-weight ratio that is much greater than that of conventional electric
motors and hydraulic cylinders.
The developed artificial muscle is composed of a rubber tube surrounded by braided
cords and is therefore highly resistant to strong external shocks and vibrations. It is
expected to result in the production of Tough Robots that can handle work where
large external shocks are applied. This is difficult for existing robots driven by electric
motors to handle, e.g., making holes in walls using an impact drill, chipping concrete
walls, etc.
We will continue to proceed with the development and implementation of Tough
Robots that exploit the newly developed hydraulic artificial muscle in order to con-
tribute to the realization and deployment of advanced robotic services for a safe and
secure society. In addition, we aim to achieve higher performance and to promote
their application and development as actuators for robotic systems.
In many industrial fields, fluidic actuators, including pneumatic and hydraulic
actuators, have been widely used in high-power systems. These fluidic actuators have
also been utilized in mechanisms with multiple degrees of freedom. Recently, robots
using hydraulic actuator systems have been attracting attention owing to their high-
powered motion and high back-drivability. However, hydraulic actuator systems are
generally large because such systems require many large control components. Thus,
there is a demand for small fluidic components to realize a small and lightweight
control system. This section introduces a novel small valve. The purpose of this
research is to develop a small hydraulic flow control valve for hydraulic actuators. A
Fig. 8.26 Basic principle of the particle excitation valve using particles oscillated by a piezoelectric
transducer
Fig. 8.27 Fabricated small two-way valve using particle excitation by a piezoelectric transducer
small flow control valve with particle excitation by means of a piezoelectric vibrator
has been applied to a hydraulic system. In addition, we developed a three-way valve
using two vibrators and applied it to fluidic actuator systems. This valve is compact
and lightweight, and can be driven by piezoelectric transducers to switch the supply
and drain of the working fluid of the actuator.
In the proposed valve, particle excitation is utilized for flow control. Ball or particle
excitation has been used for flow rate control valves to control fluidic actuators [3–7,
16, 27, 34–37]. In this section, a valve using some particles and an orifice plate
with orifice holes is proposed. Figure 8.26 shows the basic principle of the particle
excitation valve [3]. When the particles are excited by the orifice plate oscillated by
the piezoelectric transducer, the flow rate of the valve is controlled by the voltage
applied to the piezoelectric element of the transducer. Several types of valves using
this principle have been used in pneumatic control applications [3–7, 34].
Figure 8.27 shows a prototype of the small two-way valve using particle excitation
induced by a piezoelectric transducer. The valve is very small and lightweight: it has
a diameter of 10 mm, a length of 9 mm, and a weight of 4.6 g. The orifice holes in
the orifice plate are 0.4 mm in diameter. Stainless steel balls with a diameter of
0.7 mm are used as the working particles. Piezoelectric transducers using the resonance
of vibrators have been applied in high-power applications. Because the piezoelectric
transducer generates high-power vibrations, the valve can also be used for hydraulic
control. The prototype valve has been tested as a hydraulic valve. Figure 8.28 shows the
relationship between the water flow rate and the voltage applied to the piezoelectric
transducer. Although a dead zone was observed, the flow rate was controlled by the
applied voltage. In addition, the velocity of the hydraulic cylinder was controlled
successfully using this valve [35, 36].
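A dead zone like the one observed means the flow only starts above a threshold voltage. A common way to express (and invert) such a characteristic is a piecewise-linear map; the sketch below uses a hypothetical threshold and gain, not values fitted to the measured curve:

```python
def valve_flow(voltage, dead_zone_v=20.0, gain_mlpm_per_v=0.5, v_max=100.0):
    """Piecewise-linear flow model: zero below the dead zone, linear above it."""
    v = min(max(voltage, 0.0), v_max)
    if v <= dead_zone_v:
        return 0.0
    return gain_mlpm_per_v * (v - dead_zone_v)   # flow in mL/min

def compensate(desired_flow_mlpm, dead_zone_v=20.0, gain_mlpm_per_v=0.5):
    """Invert the model: command the voltage that yields the desired flow."""
    if desired_flow_mlpm <= 0.0:
        return 0.0
    return dead_zone_v + desired_flow_mlpm / gain_mlpm_per_v

print(valve_flow(10.0))               # inside the dead zone -> 0.0
print(valve_flow(compensate(15.0)))   # compensation recovers the demanded flow
```
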
To control a fluidic actuator, both the inlet and outlet of the working fluid must be
controlled. A three-way valve is a component that has an inlet and outlet for a fluidic
actuator. A three-way valve to control hydraulic artificial muscle type actuators has
been designed and fabricated. As the valve is small, it can be built into the actuator
unit easily. In this three-way valve, the supply and drain of the working fluid is
switched by using two piezoelectric transducers installed in each port [16].
The cross-sectional view of the prototype of the proposed three-way valve is
illustrated in Fig. 8.29. The actuator port is connected with a fluidic actuator. The
Fig. 8.30 Structure of the piezoelectric transducer and FEM simulation results of modal analysis
of the transducer
Fig. 8.31 FEM results of the vibration of each transducer oscillated by switching the driving
frequency
inlet and outlet ports have orifice plates for the supply and drain of the working fluid.
In each port, particles are excited by each orifice plate to control the flow rate of
the port. As the two orifice plates are of different sizes, they have different natural
frequencies. A schematic of a transducer is shown in Fig. 8.30; the structure is
based on a bolt-clamped piezoelectric vibrator. Ring-type PZT plates are used for
the oscillation, and the electrodes are made of copper. The piezoelectric elements
and electrodes are clamped by a bolt and nut to apply a preload to the piezoelectric
elements. The orifice plate is oscillated in the resonance mode of the piezoelectric
transducer. The natural frequency of the orifice plate can be simulated by using the
finite element method (FEM). The inlet and outlet of the valve need to be connected
with joint parts for the supply and drain of the working fluid. The transducer was
designed to avoid damping of the vibration at the orifice plate, as shown in the FEM
result in Fig. 8.30.
The prototype of the three-way valve was designed using modal analysis with
FEM. Figure 8.31 shows the simulation result of the vibration mode. The results
Fig. 8.32 Schematic of the three-way valve and photo of the fabricated valve
show that each orifice plate can be oscillated by its natural frequency. Each orifice
plate is oscillated at the first flexural vibration mode. In this case, the inlet and outlet
port orifices are opened at 121.7 and 168.2 kHz, respectively. The three-way valve
was designed to be fixed at the actuator port, where the vibration displacement is
negligible. Figure 8.32 shows a schematic of the designed three-way valve and a
photo of the fabricated valve. The height, outer diameter, and mass of the three-way
valve are 35.0 mm, 15.0 mm, and 26.5 g, respectively. The orifice hole and stainless steel
balls have diameters of 0.4 and 0.7 mm, respectively. In the orifice plate, nine orifice
holes are arranged at intervals of 0.2 mm at a distance of 0.4 mm from the center.
The performance of the fabricated device as a vibrator was evaluated. Figure 8.33
shows the measurement results of the relationship between the vibration velocity of
the orifice plates of the fabricated three-way valve and the driving frequency of the
transducer. The vibration velocity at the center of the orifice plate was measured using
a laser Doppler vibrometer. The plots show the different frequency distributions.
The results indicate that the orifice plates of the inlet and outlet ports have different
resonance frequencies.
The relationship between the flow rate of each port and the applied voltage for
driving of each piezoelectric transducer is shown in Fig. 8.34. In this case, the supply
pressure was 0.1 MPa. The results indicate that the flow rate of the working fluid was
controlled by the voltage applied to the transducer.
In addition, the proposed valve can also be used for the control of hydraulic actu-
ators. Some prototypes have been applied for the flow rate control of cylinders and
artificial muscle type actuators. In Fig. 8.35, a three-way valve is attached and con-
nected to the plug of a McKibben artificial muscle. The velocity of the actuator was
controlled successfully using the three-way valve. Artificial muscles are lightweight
and can be used in high-power applications. The proposed valve is suitable for
controlling artificial muscle actuators to realize a small and lightweight actuator
system for robotics applications.
8.4.3 Conclusion
In this study, a novel flow rate control valve for the control of hydraulic actuators
was proposed. The three-way valve, which has a particle excitation mechanism driven
by piezoelectric transducers, was designed, fabricated, and evaluated. In experiments
with the fabricated three-way valve, the operated port was successfully switched by
the piezoelectric transducers. The fabricated three-way valve can also be used for
the control of hydraulic actuators.
In the ImPACT-TRC project, the authors studied a new hydraulic circuit for
construction robots and legged robots in order to make them not only tough but also
high performance in motion control. The circuit was compactly integrated into a
servo unit, then duplicated and connected to one common low-pressure line. Thanks
to the boosting effect, it is expected that one can simultaneously achieve high-load
and high-precision servo control performance with low-cost hydraulic components.
This section describes the principle of the new circuit and then presents the
performance of a linear slider testbed with a single-rod cylinder.
8.5.1 Background
The above-mentioned studies are still ongoing in the hydraulics community, where the
central topics are energy saving, robustness, and cost effectiveness for mobile
applications. Motivated by this background, we invented a new hydraulic circuit, called
the Hydraulic Hybrid Servo Booster (H2SB), for high-precision control [8]. The circuit
has a main pump, four valves (V1–V4), and a servo-pump, which are connected to
the inlet/outlet of the cylinder, as illustrated in Fig. 8.36. It embeds a servomotor-
controlled pump into the valve bridge so that high speed (determined by the valve size)
and high precision (determined by the servo-pump resolution) are achieved in a hybrid
and cost-effective manner. The boosting effect enables us to drive a heavy load with a
high-precision servo drive without using larger servo motors. The circuit has three
control modes:
(1) Open-Circuit Mode
When V1 and V4 (or V2 and V3) are open while V2 and V3 (or V1 and V4) are closed,
the pressurized oil (or water) from the common pressure line flows into the inlet
side of the cylinder chamber and then returns to the reservoir from the outlet side of
the chamber. With a large volume of the pumps and/or accumulators, high-speed
motion is possible according to the valve size. Speed control can be achieved
by throttling the valves. However, the pressure applied to the cylinder cannot exceed
Ps, the pressure of the main pump. Therefore, Ps should be controlled to maintain the
minimum required value according to the operating conditions.
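In this mode the cylinder speed follows from the throttled valve flow; the classical turbulent orifice relation gives the flow for a given effective opening and pressure drop. The orifice data, discharge coefficient, and oil density below are hypothetical, for illustration only:

```python
import math

RHO_OIL = 870.0  # typical hydraulic oil density [kg/m^3]

def orifice_flow(cd, area_m2, dp_pa):
    """Turbulent orifice flow: Q = Cd * A * sqrt(2*dp/rho) [m^3/s]."""
    return cd * area_m2 * math.sqrt(2.0 * dp_pa / RHO_OIL)

def cylinder_speed(flow_m3s, piston_area_m2):
    """Piston speed for a given inlet flow [m/s]."""
    return flow_m3s / piston_area_m2

# Hypothetical: 2 mm^2 effective opening, 1 MPa drop, 32 mm bore
q = orifice_flow(0.7, 2e-6, 1e6)
v = cylinder_speed(q, math.pi / 4 * 0.032 ** 2)
print(f"flow: {q * 6e4:.2f} L/min, piston speed: {v * 1000:.1f} mm/s")
```
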
Fig. 8.36 Hydraulic Hybrid Servo Drive, consisting of a common pressure line, valve
bridges, and servo-pumps (the boxed section is duplicated for multi-axis applications):
(left) open-circuit mode, (center) closed-circuit mode, (right) boost mode
according to the selection of the four valves V1 to V4. If the EHA mode is not
needed, the two valves V1 and V2 can be merged into one pilot-operated spool
valve, which makes the system compact. The realization strongly depends on the
user application.
Table 8.6 compares the EHA and H2SB. As a design example, suppose the main
pump and the servo-pump have the same pressure capacity. This H2SB system can
generate twice the force of an EHA system while preserving the same precision as
the EHA. We can make the main pump small when the power consumption of each
subsystem is averaged and/or accumulators are introduced properly. Regeneration
with accumulators and batteries, as many existing hydraulic hybrid systems adopt,
can be introduced as necessary.
Our purpose in the ImPACT-TRC project is to apply the H2SB to robots to give
them the following features:
Items (F1)–(F3) follow naturally from well-known properties of hydraulic
systems. To achieve (F3) and (F4), conventional robots are equipped with high-power
servomotors and high-precision gears to achieve speed and force simultaneously.
The high gear ratio provides ultra-high precision. An EHA drive can mimic this
feature, but the system size becomes huge.
(Conceptual figure: position vs. time in the position control case, showing equally
high precision in tracking the target)
As the first step to prove the above-mentioned concept, we built a one-axis linear
slider, as shown in Fig. 8.38. This is a three-pole slider that can travel vertically
within the cylinder's range of motion (200 mm). The slider is driven by a standard
off-the-shelf single-rod (differential) cylinder, in which the cross-sectional areas of
the chambers on the cap side and rod side are different. The cylinder piston and rod diameters
Fig. 8.38 Simple slider testbed (composed of a single-rod cylinder, slider, springs) and the H2SB
circuit (composed of a 0.18 cm3 /rev micro piston pump, a 100 W BDC servomotor, a miniature
proportional valve and two solenoid cartridge valves)
are 32 and 18 mm, respectively. Three springs with the maximum compression of
100 mm are introduced for loading. The maximum load is 8640 N at 100 mm, which
corresponds to approximately 11 MPa at the cap side pressure Pa . The height of the
slider is measured by a potentiometer via a 16-bit AD converter.
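The quoted cap-side pressure can be checked from the figures in the text: 8640 N acting on the 32 mm piston corresponds to roughly 11 MPa:

```python
import math

bore = 0.032                        # piston diameter [m]
max_load = 8640.0                   # maximum spring load [N]
cap_area = math.pi / 4 * bore ** 2  # cap-side piston area [m^2]
pressure = max_load / cap_area      # required cap-side pressure [Pa]
print(f"cap-side pressure: {pressure / 1e6:.1f} MPa")  # prints 10.7 MPa, i.e., roughly 11 MPa
```
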
For the realization of the H2SB circuit, we chose solenoid on/off valves for V1 and
V2 and a miniature four-way proportional valve for V3 and V4. Two check valves
and pressure sensors, together with relief valves (for safety; not shown in the diagram),
are plugged into a custom manifold fabricated in the authors' lab. For the servo-pump,
we introduced a micro gear pump (internal gear pump) with a displacement of
0.18 cm³/rev. The pump is coupled with a 100 W brushless DC (BDC) servo motor.
The main pump (2.8 kW), placed outside the testbed, is pressure-controlled (the pump
is regarded as a constant pressure source).
The overall control architecture in this work is presented in Fig. 8.39. It consists of
the displacement control of the servo-pump and the throttle (meter-in and meter-out)
control of the valve bridge. For the servo-pump controller, we can use simple PD
and PID controllers for both position and force control. For the valve controller, we
follow the typical method adopted in the IMV literature: we select valves according
to the directions of speed and force, and then calculate the necessary valve opening
using the flow map of the valves. This yields the necessary valve command for the
desired flow and pressure drop across the inlet and outlet ports of the valve. These
controllers are useful in cascaded control architectures or simple 2-DoF control
architectures, which are widely used in the motion control field. The controller output
is fed into either a motor speed controller embedded in the BDC servo motor driver
or a valve amplifier for the proportional valves.
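The scheme can be sketched in a few lines: a PD loop produces the motion demand, and the valve pair is chosen from its sign, following the V1+V4 / V2+V3 pairing of the open-circuit mode. The gains and the controller form are illustrative assumptions, not the implemented controller:

```python
def pd_position_control(target, position, velocity, kp=50.0, kd=5.0):
    """PD position controller; the output is the demanded motion (speed) command."""
    return kp * (target - position) - kd * velocity

def select_valves(speed_demand):
    """Choose the valve pair to open from the demanded direction of motion
    (the pairing V1+V4 / V2+V3 follows the open-circuit mode description)."""
    return ("V1", "V4") if speed_demand >= 0.0 else ("V2", "V3")

u = pd_position_control(target=0.30, position=0.25, velocity=0.01)
print(u, select_valves(u))
```
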
8.5.5 Summary
Fig. 8.40 Position tracking experiment with open-circuit and boosting modes at 4 MPa
supply pressure; the time evolutions of the slider position, (calculated) force, pressures
(Pa, Pb, Ps), and motor speed are depicted
Based on this success, we are currently applying H2SB to a hydraulic manipulator prototype (Fig. 8.41), in which we are investigating dynamic and compliant joint motion control, as the authors previously studied with servovalve-controlled robots [11]. Items (F8) and (F9) listed in Sect. 8.5.3 are among the attractive functions of H2SB. Exploiting the passive back-drivability of EHAs [15], we are trying to achieve good joint-torque controllability under high-load conditions. Some promising experimental results can be found in our conference papers [9, 10].
The new circuit is applicable, in principle, to any hydraulic robots as long as
their joints are driven by conventional hydraulic cylinders or motors via hydraulic
hoses and pipes. Our future work includes: (1) deriving optimal control algorithms that achieve the best control performance for a given robot and motion control task; and (2) optimal system design based on deeper insight into the components.
Fig. 8.41 A three-joint manipulator prototype equipped with a novel Hydraulic Hybrid Servo Booster circuit
The robotic hand has both a hand-mode function and a bucket-mode function, which
implies that this hand can be utilized not only as a hand but also as a bucket. One of
the most important issues in designing the hand for this application is that the hand
should be as lightweight as possible, because it is intended to be attached at the tip
of the arm of a construction robot, and a heavier hand could significantly deteriorate
the dynamic characteristics of the arm. Another particularly important issue is simplification. In an actual operational situation, the operating conditions of the robotic hand are the same as those of existing construction machines. This means that the hand should be able to work in wet and/or dry mud or gravel, and should be able to grasp rocks, timber, etc., without experiencing technical faults. Considering these heavy-duty conditions, we highly prioritized simplicity in the design of not only the structure of the hand but also the hydraulic and electric control systems. In short, the design priorities were weight saving and simplification.
Figure 8.43 shows the appearance of the tough hand. This hand has four fingers.
The two fingers on the outer side can rotate around their vertical axes at their roots.
The root of each finger is supported by a pair of tapered-roller bearings mounted in
overhang plates on the palm box. Each rotating finger is driven by a vane-type swing
motor placed on the upper side of the upper overhanging plate. Between the upper
and lower overhanging plates, a 4-port swivel joint is located, of which only two ports are used in this application. The two fingers on the inner side are fixed to the palm
box at the finger roots. In this paper, the outer-side fingers and inner-side fingers are
termed as “rotating fingers” and “fixed fingers,” respectively. As mentioned above,
the hand has two operation modes, and the change in the mode, or in other words, the "transformation", is carried out by rotating the two rotating fingers around their vertical spinning axes in opposite directions by approximately 160°, as shown in Fig. 8.44. In the hand mode, the rotating fingers and the fixed fingers face each other, but in the bucket mode they line up laterally.
Fig. 8.44 Hand mode and bucket mode of the hydraulic tough hand
In the bucket mode, in order to form a concave shape, all the root joints are
wide open, and the two tip joints of the fixed fingers are closed. In this case, the
opening angle of the fixed fingers is designed to be larger than that of the rotating
fingers. Furthermore, the rotating fingers are designed to be shorter than the fixed
fingers so that all the four fingertips can be arranged in line to ensure satisfactory
bucket performance. However, this difference in the finger lengths may in turn degrade the performance of the device as a hand. The finger lengths were thus chosen as an optimal compromise.
Figure 8.45 shows the cross-sectional view of the hand. A fixed finger has three
joints, but two of these joints are interlocked by a link bar and bend simultaneously.
Thus, a fixed finger has two degrees of freedom. The rotating finger has one joint
that is located near the spin axis. In all fingers, the bending motion of one degree
of freedom is generated by one linear hydraulic actuator (cylinder). All the linear actuators employed in this hand have the same specifications. Hereafter, this linear actuator is sometimes called a bending (control) actuator. Several of the parts adopted
in this hand, such as the linear actuators, swing motors, and swivel joints have been
newly developed. As is described in Sect. 8.2, the linear actuator has exceptionally
low internal friction, resulting in a rather small hysteresis of the finger bending
motion. The internal parts of the fingers, including hoses, should not be exposed to
8 New Hydraulic Components for Tough Robots 443
the external environment such as mud or gravel. To satisfy this requirement, mutually
sliding portions around the joints are covered with double walls. A cleat is attached
on each fingertip. This cleat is a half-circle disk with a shark-tooth outer profile that helps generate sufficient friction force when holding or pinching heavy objects, and it greatly contributes to stable holding.
Figure 8.46 shows the hydraulic control system. Based on the previously mentioned
concept, this hand does not have any sensor in the fingers or the palm section. The
only sensor in the system is a pressure sensor attached to the valve block placed
separately from the hand. The hydraulic control system consists of four servo valves,
one pilot check valve, and one 1:1 flow dividing valve. The four servo valves are
attached on a valve block. The pilot check valve and flow divider are attached on the
front of the palm box. Pressure supply and oil return are from and to the robot main
body, respectively. All the pressures of the linear actuators (bending actuators) are
controlled by one servo valve, which is used as a pressure reducing valve, and the
output of the pressure sensor is used as the feedback signal. The output pressure of
this valve is introduced to the bending actuators through two bending control (finger
open and close) valves. One valve controls the bending of the root joints of all the
four fingers simultaneously, and the other controls the bending of the tip joints of the
fixed fingers. Owing to this simple configuration, all the operating pressures of the
bending actuators are the same, and the grasping force of the hand is controlled by
this pressure. In the hand mode, the tip joints are usually kept open, and only the root joints are manipulated. Note that independent control of each finger is not possible; however, this does not significantly impair the grasping or holding function. The hand can pinch or grasp objects of various shapes and sizes. As
for swing motors, the operating pressure is directly introduced from the main robot
body through one rotation control valve. Therefore, both the rotating fingers swing
simultaneously. As mentioned before, the mode change is performed by turning the rotating fingers using the swing motors. The turning motion of the fingers in both directions (clockwise and counterclockwise) is limited by mechanical stoppers attached to the rotating fingers; thus, the configurations of the fingers in both the hand mode and bucket mode are fixed, and the turning motion of the rotating fingers cannot be stopped at an intermediate position.
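Because all bending actuators share one controlled pressure, the grasping force of each cylinder scales directly with that pressure (F = P × A). A minimal sketch; the bore diameter below is a hypothetical value, not taken from the chapter:

```python
import math

def bending_actuator_force(pressure_mpa, bore_mm):
    """Force of one bending cylinder from the shared operating pressure.
    MPa multiplied by mm^2 conveniently yields newtons."""
    area_mm2 = math.pi * (bore_mm / 2.0) ** 2
    return pressure_mpa * area_mm2  # [N]
```

At the maximum operating pressure of 21 MPa, a hypothetical 20 mm bore would give roughly 6.6 kN per cylinder.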
In this system, servo valves are adopted for bending control and rotation control.
However, they are used as three-way flow-direction control valves. Overlaps of the
main spools of these servo valves are selected optimally to facilitate this special
function. There are mainly two reasons why servo valves are used instead of the
usual and less expensive flow direction control valves. One is the limitation of the
allowable current capacity supplied by the robot main body. The other reason is
the limited space available for valves. Fortunately, a very small servo valve, which
consumes only a small current (∼30 mA), was available. Therefore, we adopted servo
valves for this function.
In the hand mode, owing to the adopted simple hydraulic system, the motions of
the fixed fingers and those of the rotating fingers are not necessarily synchronized
because of the difference in their masses and internal frictional forces. To minimize
the non-synchronized motion, a 1:1 flow dividing valve is adopted, which attempts
to equalize the volume flow to the actuators (controlling the closing speed of the
fingers) of the fixed fingers and of the rotating fingers. Furthermore, the pinching function can become unstable owing to the lack of control of the position or angle of the finger joints. To increase the stability, as shown in Fig. 8.46, a
pilot check valve is introduced, which functions as a sort of ratchet around the finger
joint axis, and is helpful for enabling stable pinching.
Figure 8.47 shows the electric control system. This construction robot is manipulated
via a wireless remote-control system. The operator is in a cockpit, which is located
at a distance from the construction robot, and he/she handles a specific device to
control the robot, with support provided by a co-operator. (The details are described
in Chap. 5.) The operation commands for the hand are also sent from the cockpit.
The hand control box receives these commands through the robot’s main controller,
and controls the hand. The main operation commands are as follows: (1) hand-mode
command, (2) bucket-mode command, (3) root joint closing (pinch or grasp) and
opening (free) with grasping force control (hand-mode, all fingers), (4) tip joint
opening and closing (fixed fingers), (5) reset, (6) setting hand shape with all fingers
at fully open positions in the hand mode. Under command (3) mentioned above, the
fingers are controlled in the following manner (Fig. 8.48): The opening or closing
of the fingers is controlled by a type of rotary slider manipulated by the gripping
motion of the operator’s finger from the cockpit. The slider has a neutral position
with a dead zone in the middle of the slide range. When the slider is in the dead
zone, the bending control valve is kept neutral, and the operating pressure is at its
minimum value. When the slider is moved forward (gripping direction) and it goes
out of the dead zone, the bending valve controls the actuators to close the fingers,
and the operating pressure is increased as the slider leaves the neutral zone. When
the slider position is near the boundary of the neutral zone, the pressure is low (3–
6 MPa), which makes it possible to close the fingers gradually. When the slider is at
the dead end, the pressure is at its maximum value (21 MPa). The finger motion and
grasping force is thus controlled. When the slider is moved backwards, the fingers are
controlled to open, and in this case the pressure is also increased as the slider leaves the neutral zone, in the manner mentioned previously. Thus, the grasping or pinching
motion is controlled. The slider is acted upon by a reaction force in accordance with
the operating pressure command to enable easier manipulation. In the bucket mode
(2), none of the bending and rotation commands are valid.
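The slider-to-pressure mapping described above can be sketched as a simple piecewise function. The 3 MPa and 21 MPa endpoints follow the text, while the normalized travel range and dead-zone bounds are illustrative assumptions:

```python
def slider_command(x, dead_lo=0.4, dead_hi=0.6, p_low=3.0, p_max=21.0):
    """Map a normalized slider position x in [0, 1] to a (direction,
    pressure [MPa]) command. Inside the dead zone the bending valve stays
    neutral at minimum pressure; outside it, pressure rises linearly from
    a low value at the zone boundary to the maximum at the end of travel."""
    if dead_lo <= x <= dead_hi:
        return "neutral", p_low
    if x > dead_hi:  # forward / gripping direction: close the fingers
        frac = (x - dead_hi) / (1.0 - dead_hi)
    else:            # backward: open the fingers
        frac = (dead_lo - x) / dead_lo
    return ("close" if x > dead_hi else "open"), p_low + frac * (p_max - p_low)
```

For example, a slider three-quarters of the way out of the dead zone commands an intermediate pressure, which is what allows the gradual finger closing described above.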
The control box was specifically designed and manufactured for the dual arm
construction robot. As is the case with the valve block, only a narrow space is available
for the hand control box. A special hand-controller board that includes the servo-valve driver circuit was therefore designed. Owing to this design, the control box is sufficiently small (100 mm × 40 mm × 76 mm) to be installed on the arm next to the valve block (see Fig. 8.51).
Fig. 8.49 Tough hand pinching several types of objects in hand mode
Before installing it on the dual-armed robot, the hand was attached to a single-armed robot, and its grasping and pinching performance was tested, as illustrated in Fig. 8.49.
The tested objects, listed in Table 8.7, were selected so that a wide range of charac-
teristics such as light or heavy weight, small or large size, soft or hard nature, short or
long length, and round or square shape, could be covered. The procedure to evaluate the performance is as follows: First, the object lying on the ground is grasped and lifted to a height of approximately 1 m. Then, the hand with the object is pitched upward (the usual bucket-dump motion) by approximately 90° and returned to the original hanging position. Subsequently, the object is placed back on the ground. When all these steps are performed sequentially in a stable condition, the performance is evaluated as "good"; otherwise, it is "bad." All the selected objects yielded good performance.
The performance of the hand as a bucket was evaluated in a gravel-scooping task. Figure 8.50 shows images of the test. Gravel on the ground was scooped into the bucket and then released. Although no critical problems were noted in these processes and the results were satisfactory, some aspects that could be improved in the future were highlighted; for instance, it would be preferable to increase the bucket capacity.
Fig. 8.50 Scooping of gravel using the tough hand in bucket mode
The hydraulic tough hand was finally attached to the tip of the lower arm of the dual-armed construction robot, as shown in Fig. 8.51. The hand is attached by two thick pins
in the same way as a usual bucket. The valve block is attached on the intermediate
roll unit. The hand and the valve block are connected through six hoses. Two hoses
from the main body of the robot are connected to the P-port and the T-port of the
valve block. The hand was tested in the Fukushima Robot Test Field, which includes
mockups of several types of disaster sites. The task of the dual armed construction
robot related to the proposed tough robot hand was as follows (Fig. 8.52): (1) Remove
the gravel that covers the roof of a collapsed house by using the tough hand in the
bucket mode; (2) Lift one portion of the crushed roof, which weighs approximately 100 kg, in the hand mode; (3) Keeping the roof lifted up, place a jack in the space
between the roof and the ground to ensure clearance for the rescue personnel to enter.
Placing a jack requires operation of the upper hand. This task also demonstrates the
co-operation of the upper hand and the lower hand. The test results were satisfactory, demonstrating the effectiveness of the developed dual-arm construction robot and our hydraulic tough hand. The findings are encouraging for further development of robots in this direction.
Fig. 8.51 Tough hand attached to the tip of the lower arm of DACR
Fig. 8.52 Hand removing gravel from a collapsed roof in the bucket mode (left) and lifting it up in the hand mode (right) in the mockup of a disaster site
Acknowledgements This research was funded by ImPACT Tough Robotics Challenge Program of
Council for Science, Technology and Innovation (Cabinet Office, Government of Japan). The authors
thank Yuken Kogyo Co., Ltd., KYOEI INDUSTRIES.CO., LTD., Pneumatic Servo Controls LTD.,
KAWAMOTO HEAVY INDUSTRIES co., ltd., MARUZEN KOGYO CO., LTD., ONO-DENKI
CO., LTD., Weltec-sha Inc. Ltd., Hydraulic Robots Research Committee, Tokyo Keiki Inc., Fine
Sinter Corp., Takako Inc., and Mori Kogyo, Ltd. for their support.
References
1. Fujita, S., Baba, K., Sudoh, D.: RESCUE ROBOT ‘T-52 ENRYU’. In: International Symposium
on Automation and Robotics in Construction, Tokyo, Japan (2006)
2. Hashimoto, K., Kimura, S., Sakai, N., Hamamoto, S., Koizumi, A., Sun, X., Matsuzawa, T., Teramachi, T., Yoshida, Y., Imai, A., et al.: WAREC-1: a four-limbed robot having high locomotion ability with versatility in locomotion styles. In: 2017 IEEE International Symposium on Safety, Security and Rescue Robotics (SSRR 2017), pp. 172–178 (2017)
3. Hirooka, D., Suzumori, K., Kanda, T.: Flow control valve for pneumatic actuators using particle excitation by PZT vibrator. Sens. Actuators A Phys. 155, 285–289 (2009)
4. Hirooka, D., Suzumori, K., Kanda, T.: Design and evaluation of orifice arrangement for particle-excitation flow control valve. Sens. Actuators A Phys. 171, 283–291 (2011)
5. Hirooka, D., Yamaguchi, T., Furushiro, N., Suzumori, K., Kanda, T.: Research on controllability of the particle excitation flow control valve. In: Proceedings of 6th International Conference on Manufacturing, Machine Design and Tribology, pp. 136–137 (2015)
6. Hirooka, D., Yamaguchi, T., Furushiro, N., Suzumori, K., Kanda, T.: Highly responsive and stable flow control valve using a PZT transducer. In: Proceedings of IEEE International Ultrasonics Symposium, 6H-6 (2016)
7. Hirooka, D., Yamaguchi, T., Furushiro, N., Suzumori, K., Kanda, T.: Particle-excitation flow-control valve using piezo vibration: improvement for a high flow rate and research on controllability. IEEJ Trans. Sens. Micromachines 137, 32–37 (2017)
8. Hyon, S., Mori, Y., Mizui, H.: Hydraulic drive circuit. US Patent 9458864
(PCT/JP2013/069900) (2016)
9. Hyon, S., Tanimoto, S.: Joint torque control of a hydraulic manipulator with a hybrid servo
booster. In: The Tenth JFPS International Symposium on Fluid Power, 1C11 (2017)
10. Hyon, S., Tanimoto, S., Asao, S.: Toward compliant, fast, high-precision, and low-cost manip-
ulator with Hydraulic Hybrid Servo Booster. In: IEEE International Conference on Robotics
and Automation, Singapore, 30 May, pp. 39–44 (2017)
11. Hyon, S.: A motor control strategy with virtual musculoskeletal systems for compliant anthro-
pomorphic robots. IEEE/ASME Trans. Mechatron. 14(6), 677–688 (2009)
12. Ishii, A.: Operation system of a double-front work machine for simultaneous operation. In:
International Symposium on Automation and Robotics in Construction, Tokyo, Japan (2006)
13. Jansson, A., Palmberg, J.: Separate controls of meter-in and meter-out orifices in mobile hydraulic systems. SAE Trans. 99(2), 377–383 (1990)
14. JPN Miniature Hydraulic Cylinders. http://www.j-p-n.co.jp/english/eng_index.html
15. Kaminaga, H., Otsuki, S., Nakamura, Y.: Development of high-power and backdrivable linear
electro-hydrostatic actuator. In: IEEE-RAS International Conference on Humanoid Robots,
pp. 973–978 (2014)
16. Kanda, T., Osaki, H., Seno, N., Wakimoto, S., Ukida, T., Suzumori, K., Nabae, H.: A small three-way hydraulic valve using particle excitation controlled by one piezoelectric transducer. In: Proceedings of 16th International Conference on New Actuators, pp. 442–445 (2018)
17. Kuribayashi, K.: Criteria for evaluation of new robot actuators as energy converters. J. RSJ
7(5), 35–43 (1998)
18. MOOG, Electrohydrostatic actuators. http://www.moog.com/products/actuators-
servoactuators/actuation-technologies/electrohydrostatic/
19. Mori, M., Tanaka, J., Suzumori, K., Kanda, T.: Field test for verifying the capability of two high-
powered hydraulic small robots for rescue operations. In: IEEE/RSJ International Conference
on Intelligent Robots and Systems, pp. 3492–3497 (2006)
20. Mori, M., Suzumori, K., Seita, S., Takahashi, M., Hosoya, T., Kusumoto, K.: Development of
very high force hydraulic McKibben artificial muscle and its application to shape-adaptable
power hand. In: IEEE International Conference on Robotics and Biomimetics, pp. 1457–1462
(2009)
21. Morita, R., Nabae, H., Suzumori, K., Yamamoto, A., Sakurai, R.: Development of hydraulic
McKibben artificial muscle. In: The 34th Annual Conference of the Robotic Society of Japan,
Proceedings of the 34th Annual Conference of RSJ, RSJ2016AC3C3-01 (2016)
22. Morita, R., Suzumori, K., Nabae, H., Endo, G., Sakurai, R.: Concrete chipping by antagonistic drive of hydraulic artificial muscles. In: Proceedings of the Robotics and Mechatronics Conference 2017 (2017)
23. Morita, R., Nabae, H., Endo, G., Suzumori, K.: A proposal of a new rotational-compliant joint with oil-hydraulic McKibben artificial muscles. Adv. Robot. 32, 511–523 (2018)
24. Morita, R., Nabae, H., Endo, G., Suzumori, K.: 3 DoF wrist mechanism for tough robots by
hydraulic artificial muscles. In: Proceedings of the Robotics and Mechatronics Conference
2017 (2018)
25. Nabae, H., Hemmi, M., Hirota, Y., Ide, T., Suzumori, K., Endo, G.: Super-low friction and
lightweight hydraulic cylinder using multi-directional forging magnesium alloy and its appli-
cation to robotic leg. Adv. Robot. 32(9), 524–534 (2018)
26. Opdenbosch, P., Sadegh, N., Book, W., Enes, A.: Auto-calibration based control for independent metering of hydraulic actuators. In: IEEE International Conference on Robotics and Automation, pp. 153–158 (2011)
27. Osaki, H., Kanda, T., Ofuji, S., Seno, N., Suzumori, K., Ukida, T., Nabae, H.: A small three-
way valve using particle excitation with piezoelectric for hydraulic actuators. Adv. Robot. 32,
500–510 (2018)
28. Raibert, M., Blankespoor, K., Nelson, G., Playter, R.: Bigdog, the rough-terrain quadruped
robot, In: Proceedings of 17th World Congress The International Federation of Automatic
Control, Seoul, Korea (2008)
29. Rydberg, K.: Energy efficient hydraulic hybrid drives. In: 11th Scandinavian International
Conference on Fluid Power (SICFP’09), Linköping, Sweden (2009)
30. Schulte, H.F.: The Characteristics of the McKibben Artificial Muscle. The Application of
External Power in Prosthetics and Orthotics (Appendix H). National Academy of Sciences,
vol. 874, pp. 94–115. National Research Council, Washington D.C. (1961)
31. Stecki, J., Matheson, P.: Advances in automotive hydraulic hybrid drives. In: Sixth JFPS Inter-
national Symposium on Fluid Power, Tsukuba, Japan, pp. 664–669 (2005)
32. Suzumori, K., Faudzi, A.A.: Trends in hydraulic actuators and components in legged and tough
robots: a review. Adv. Robot. 32(9), 458–476 (2018)
33. Tanaka, J., Suzumori, K., Takata, M., Kanda, T., Mori, M.: A mobile jack robot for rescue
operation. In: Security and Rescue Robotics (SSRR2005), pp. 99–104 (2005)
34. Tatsumi, M., Izusawa, K., Hirai, S.: Miniaturized unconstrained valves with pressure control for driving a robot finger. In: Proceedings of IEEE International Conference on Robotics and Biomimetics, pp. 1528–1533 (2011)
35. Ukida, T., Suzumori, K., Nabae, H., Kanda, T., Ofuji, S.: A small water flow control valve using particle excitation by PZT vibrator. In: Proceedings of 6th ICAM, pp. 221–222 (2015)
36. Ukida, T., Suzumori, K., Nabae, H., Kanda, T., Ofuji, S.: Hydraulic control by flow control
valve using particle excitation. JFPS Int. J. Fluid Power Syst. 10, 38–46 (2017)
37. Ukida, T., Suzumori, K., Nabae, H., Kanda, T.: Analysis of flow control valve in hydraulic system using particle excitation. In: Proceedings of the 10th JFPS International Symposium on Fluid Power, 2C12 (2017)
38. Yao, B., Liu, S.: Energy-saving control of hydraulic systems with novel programmable valves.
In: 4th World Congress on Intelligent Control and Automation, Shanghai, pp. 3219–3223
(2002)
Chapter 9
Simulator for Disaster Response Robotics
Abstract This chapter presents a simulator for disaster response robots based on the
Choreonoid framework. Two physics engines and a graphics engine were developed
and integrated into the framework. One physics engine enables robust contact-force
computation among rigid bodies based on volumetric intersection and a relaxed
constraint, whereas the other enables accurate and computationally efficient computation of machine–terrain interaction mechanics based on macroscopic and microscopic approaches. The graphics engine allows simulating natural phenomena, such as rain,
fire, and smoke, based on a particle system to resemble tough scenarios at disas-
ter sites. In addition, wide-angle vision sensors, such as omnidirectional cameras
and LIDAR sensors, can be simulated using multiple rendering screens. Overall, the
simulator provides a tool for the efficient and safe development of disaster response
robots.
Fig. 9.1 Stages of development and deployment of disaster response robots (robot design, controller development, workplan validation, operation training, and deployment). Simulators can reduce the required cost and time for each of these stages and the complete process
9.1 Introduction
manufacturing many copies and the risk associated with the large weight and size of such robots.
Like other simulators, the one we propose basically has the structure shown in Fig. 9.2, which consists of the following three main components.
Physics engine This component applies the commands given by controllers to
robots in the simulation and computes the corresponding modification of the
simulation world after a sampling period. Angle sensors, IMUs, and force/torque
sensors can be simulated by retrieving the results of the computation.
Graphics engine This component allows the user to visualize the simulation through 3D graphics and provides rendered images. In addition, RGB cameras,
RGBD cameras, and LIDAR sensors can be simulated from the rendered images.
Simulator framework This component initializes the physics and graphics
engines by reading model files that define robots and the environment. In addition,
it manages the information exchange between the engines and robot controllers
and defines a user interface.
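In code, the interplay among the three components can be sketched as a plain loop; the callable interface below is an assumption for illustration, not Choreonoid's actual API:

```python
def run_simulation(control, step_physics, render, sensors, n_steps):
    """Skeleton of the simulator loop: the controller maps sensor data to
    commands, the physics engine applies them and advances one sampling
    period (updating angle sensors, IMUs, force/torque sensors), and the
    graphics engine contributes rendered vision-sensor outputs."""
    for _ in range(n_steps):
        command = control(sensors)       # algorithm or teleoperation input
        sensors = step_physics(command)  # proprioceptive sensor values
        sensors.update(render())         # RGB/RGBD cameras, LIDAR images
    return sensors
```

A controller, a physics stepper, and a renderer plugged into this loop reproduce the data flow of Fig. 9.2 at its coarsest level.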
The user needs to provide the robot and environment models as well as the robot
controllers to the simulator. These models contain specific geometric, dynamics, and
kinematics properties. The controllers receive sensor information from the simulation
environment and deliver commands obtained from either algorithms or teleoperation
equipment.
As with simulators, a myriad of open-source and commercial physics engines is available, including ODE [26], Bullet [5], PhysX [25], and AGX Dynamics [1],
which mainly compute rigid body dynamics. However, efficient contact-force simu-
lation among rigid bodies with arbitrary shapes is still an open problem. Moreover,
some physics engines support only primitive shapes (e.g., spheres, cubes) to guar-
antee efficient and robust computation. Even if an engine supports polygon meshes,
simulation outcomes can be inaccurate, and simulating nonrigid objects such as soil
within reasonable computation time and accuracy is still very challenging. In contrast, at disaster sites, robots with a variety of shapes may engage in contact with complex-shaped debris, move on roads covered by soil, and perform excavation oper-
ations. Considering such scenarios, we have developed physics engines capable of
simulating these phenomena, as detailed in Sect. 9.2 and Sect. 9.3.
456 F. Kanehiro et al.
At disaster sites, phenomena such as smoke, mist, and fire can disturb the robot's visual sensing. To verify the proper operation of a disaster response robot in this type of tough environment, the simulator should reproduce these phenomena. Many
disaster response robots currently use wide-angle vision sensors such as omnidirec-
tional cameras and LIDAR sensors, and hence those sensors must be also simulated.
To this end, the graphics engine presented in Sect. 9.4 has been developed.
Overall, the abovementioned physics engines have specific capabilities and limi-
tations, and thus no physics engine can outperform others in every aspect. In addition,
users employ different software platforms, such as the widely used OpenRTM [2]
and ROS [28]. Consequently, a simulation framework must be flexible and extensible
for users to freely choose the engines and platform of their preference. To develop
the proposed disaster response robot simulator, we employed the Choreonoid [21]
simulation framework, which is described in the sequel.
Fig. 9.3 Screenshot of Choreonoid. The simulation shows a disaster response robot driven by con-
tinuous tracks. The top-right area of the screen displays the simulation as a 3D computer animation,
whose viewpoint can be freely changed. The two central panels show the simulated camera images
and LIDAR measurements
9 Simulator for Disaster Response Robotics 457
window. In the top-right panel of the screen, the state and movement of robots and
the objects in the scene are displayed as an animation using 3D computer graphics.
The robot motion and physics of the objects in the simulation are calculated by algo-
rithms based on physical models to resemble the behaviors emerging in real world.
Choreonoid is capable of simulating various types of robots including manipulators,
legged robots, wheeled robots, continuous-track robots, and even multicopters. Note
that continuous tracks represent a mechanism that is difficult to simulate, but the latest version of Choreonoid suitably reproduces their behavior by relying on the commercially available AGX Dynamics physics engine [1]. The models in
Choreonoid are described by model files, thus allowing the creation and simulation
of any robot, provided that its mechanism is supported by the simulator. Furthermore,
Choreonoid can simulate sensors mounted on a robot, such as cameras and LIDAR
sensors, whose outcomes are shown in the central panels of the screenshot in Fig. 9.3.
Choreonoid provides a highly extensible architecture, and new functions can be
easily added and integrated as plugins. This extensibility is the basis for the proposed
disaster response robot simulator, which extends the scope of Choreonoid. Likewise, any added function or method can be implemented and tested on Choreonoid with minimal effort. Thus, the platform can help to efficiently develop fundamental technologies related to robot simulation. Moreover, the existing methods in Choreonoid can be easily compared with newly developed ones. Overall, the extensibility of Choreonoid allows implementing not only simulation functions but also a wide range of functions related to robot operation. For instance, high-level teleoperation inter-
faces have been embedded as Choreonoid functions [23]. The interfaces were used at
the disaster response robot competition DARPA Robotics Challenge Finals. In this
way, Choreonoid can be used in various ways as a software platform for developing,
testing, and operating disaster response robots.
Problem (ii) can be addressed using penalty-based methods, which require estimation within small intervals to finely track the evolution of microscopic deformations and thereby increase the total computation time. Alternatively, a macroscopic model [4, 14, 17, 22] can inversely estimate the impulse exerted on bodies from preferred constraints on velocity and enables stable simulations with larger intervals. Nevertheless, this approach often produces chattering in the computed contact forces.
To overcome these limitations, a novel method to compute contact forces from
volumetric intersections and a relaxed constraint [32] is described in this section.
Some case studies have shown that the proposed method provides more realistic
results than conventional methods. Besides those presented here, additional details
on the proposed method are available in a conference paper [33].
Suppose body shapes are modelled by convex polyhedra. Hence, the contact solids,
i.e., the intersection of pairs of colliding bodies, are also convex polyhedra. Figure 9.4
illustrates restitution direction ni of the ith contact solid, Vi , resulting from the
collision between bodies BiA and BiB . The restitution direction can be expressed as
n_i = \mathrm{norm}\left( \sum_{P_{i,j} \in L_{i,B}} S_{i,j}\, n_{i,j} - \sum_{P_{i,j} \in L_{i,A}} S_{i,j}\, n_{i,j} \right),    (9.1)
where Pi, j is the jth plane of Vi , Si, j is the area of Pi, j , ni, j is the normal vector
to Pi, j , Li,∗ (*∈ {A, B}) is the set of planes of colliding bodies Bi∗ in Pi, j , and
function norm(x) for arbitrary 3D vector x is defined as
\mathrm{norm}(x) \overset{\mathrm{def}}{=} x / \|x\|.    (9.2)
The restitution direction indicates the direction that minimizes the volume of Vi when
BiA and BiB infinitesimally move apart from each other along that direction. Contact
plane Pi is defined such that it passes the barycenter, pi0 , of Vi with normal vector
ni assuming bodies with equal stiffness.
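Equations (9.1)–(9.2) can be sketched numerically as follows; the face-list input format (area, normal, owning body) is a hypothetical illustration for this sketch, not Choreonoid's data structure:

```python
import numpy as np

def norm(x):
    """Unit vector x / ||x|| (Eq. 9.2)."""
    return x / np.linalg.norm(x)

def restitution_direction(faces):
    """Restitution direction n_i of a contact solid (Eq. 9.1).

    `faces` is a list of (area, normal, owner) tuples, one per plane of the
    contact solid V_i, where `owner` is 'A' or 'B' depending on which of the
    two colliding bodies the plane originates from.
    """
    s = np.zeros(3)
    for area, normal, owner in faces:
        n = np.asarray(normal, dtype=float)
        # planes from body B contribute positively, planes from body A negatively
        s += area * n if owner == 'B' else -area * n
    return norm(s)
```

For a box resting on the ground, the contact solid is a thin slab whose top plane comes from the box and bottom plane from the ground, and the computed direction points straight up.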
The contact force is the integral of the stress distributed over the contact plane.
Let the stress on Pi be denoted as f i ( p), where p ∈ Pi is the position vector of
any contact point. The equation of motion of a robot modelled as a rigid multibody
is given by
H \ddot{q} + b = \tau + \sum_i \int_{p \in P_i} J_i(p)^T f_i(p)\, ds,    (9.3)
where q is the joint displacement, H is the inertia matrix, b accounts for centrifugal,
Coriolis, and gravitational forces, τ is the joint actuation force, J i ( p) is the Jacobian
matrix that transforms an infinitesimal joint displacement into the corresponding
infinitesimal variation of p, and ds is an infinitesimal area around p. The following
constraints are imposed on any p on P_i:

f_{in}(p) \ge 0, \quad v_{in}(p) \ge 0, \quad f_{in}(p)\, v_{in}(p) = 0,    (9.4)

\| f_{is}(p) \| \le \mu_s f_{in}(p) \ \text{if}\ v_{is}(p) = 0, \qquad f_{is}(p) = -\mu_k f_{in}(p)\, \frac{v_{is}(p)}{\| v_{is}(p) \|} \ \text{otherwise},    (9.5)

where v_i(p) is the relative velocity of p, f_{in}(p) \overset{\mathrm{def}}{=} n_i^T f_i(p), f_{is}(p) \overset{\mathrm{def}}{=} f_i(p) - f_{in}(p) n_i, v_{in}(p) \overset{\mathrm{def}}{=} n_i^T v_i(p), v_{is}(p) \overset{\mathrm{def}}{=} v_i(p) - v_{in}(p) n_i, \mu_s is the static friction coefficient, and \mu_k is the kinetic friction coefficient. Equation (9.4) represents the complementarity between the normal force and velocity, whereas Eq. (9.5) represents Coulomb's friction law. The computation of contact forces consists of finding every f_i(p) that satisfies (9.4) and (9.5) under Eq. (9.3).
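Assuming the standard form of the complementarity and Coulomb conditions described above, a point-wise feasibility check can be sketched as follows (function and argument names are illustrative):

```python
import numpy as np

def satisfies_contact_laws(f_n, v_n, f_s, v_s, mu_s, mu_k, tol=1e-9):
    """Check whether a normal force f_n, normal velocity v_n, tangential
    force vector f_s, and tangential velocity vector v_s at one contact
    point satisfy the complementarity condition (9.4) and Coulomb's
    friction law (9.5)."""
    f_s = np.asarray(f_s, dtype=float)
    v_s = np.asarray(v_s, dtype=float)
    # (9.4): non-negative normal force and velocity, and complementarity
    if f_n < -tol or v_n < -tol or abs(f_n * v_n) > tol:
        return False
    if np.linalg.norm(v_s) <= tol:
        # sticking contact: tangential force must lie in the static friction cone
        return bool(np.linalg.norm(f_s) <= mu_s * f_n + tol)
    # sliding contact: kinetic friction opposes the sliding direction
    expected = -mu_k * f_n * v_s / np.linalg.norm(v_s)
    return bool(np.allclose(f_s, expected, atol=tol))
```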
This problem can be reformulated in terms of the resultant force and the relative six-axis velocity \hat{v}_{iAB} = \hat{v}_{iB} - \hat{v}_{iA} from B_{iA} to B_{iB}, where

\hat{v}_{i*} = [\, v_{i*}^T \ \ \omega_{i*}^T \,]^T \quad (* \in \{A, B\}),    (9.6)

and v_{i*} and \omega_{i*} are the linear and angular velocities of B_{i*}, respectively. Velocity
vi ( p) is represented as
v_i(p) = K_i(p)\, \hat{v}_{iAB},    (9.7)

where K_i(p) \overset{\mathrm{def}}{=} [\, 1 \ \ -[(p - p_{i0}) \times] \,], with 1 being the identity matrix and [x \times] the outer product matrix for an arbitrary 3D vector x. Then, Eq. (9.3) becomes
H \ddot{q} + b = \tau + \sum_i J_{iAB}^T \hat{f}_{iAB},    (9.8)

where J_{iAB} is the Jacobian matrix that transforms \dot{q} into \hat{v}_{iAB} and
460 F. Kanehiro et al.
\hat{f}_{iAB} \overset{\mathrm{def}}{=} \int_{p \in P_i} K_i(p)^T f_i(p)\, ds.    (9.9)
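Because the integrands built from K_i(p), as in Eq. (9.9), are at most quadratic in p, their surface integrals over each triangle of the polygonal contact plane are evaluated exactly by edge-midpoint quadrature. A minimal sketch of such a triangle integral, with hypothetical helper names (not Choreonoid's API):

```python
import numpy as np

def cross_mat(x):
    """Outer (cross) product matrix [x x]."""
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]])

def K(p, p0):
    """3x6 matrix mapping a six-axis velocity to a point velocity (Eq. 9.7)."""
    return np.hstack([np.eye(3), -cross_mat(p - p0)])

def Q_triangle(v1, v2, v3, p0):
    """Integral of K(p)^T K(p) over a triangle (v1, v2, v3) by edge-midpoint
    quadrature, which is exact for this quadratic integrand."""
    v1, v2, v3, p0 = (np.asarray(v, dtype=float) for v in (v1, v2, v3, p0))
    area = 0.5 * np.linalg.norm(np.cross(v2 - v1, v3 - v1))
    mids = [(v1 + v2) / 2, (v2 + v3) / 2, (v3 + v1) / 2]
    return area / 3.0 * sum(K(m, p0).T @ K(m, p0) for m in mids)
```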
If all the contact points remain static, there exists \{f_i\} that satisfies (9.4) and (9.5).
If no solution exists, some points must either slide or detach. Notice that this is an
over-constraint because the contact points on a rigid body do not move indepen-
dently. Hence, to prevent a related ill-posed problem, the constraint can be relaxed
by employing the regularization technique with compensation velocity illustrated in
Fig. 9.5, with

Q_i = \int_{p \in P_i} K_i(p)^T K_i(p)\, ds = \sum_j S_{i,j} \begin{bmatrix} 1 & -[\, p_{i,j123} \times] \\ [\, p_{i,j123} \times] & -\tfrac{1}{3}\left( [\, p_{i,j12} \times]^2 + [\, p_{i,j23} \times]^2 + [\, p_{i,j31} \times]^2 \right) \end{bmatrix},    (9.13)
where \alpha_i is a negligible term and S_{i,j} is the projected area of P_{i,j} onto the contact plane. Here, f = [\, \hat{f}_{1AB}^T \cdots \hat{f}_{mAB}^T \,]^T, v = [\, \hat{v}_{1AB}^T \cdots \hat{v}_{mAB}^T \,]^T, Q = \mathrm{diag}\{Q_i\}, n = [\, n_1^T \cdots n_m^T \,]^T, C = [\, C_1^T \cdots C_m^T \,]^T, m is the number of contact volumes, and \lambda is the regularization factor. Equation (9.8) can be discretized as
H[k]\, \frac{\dot{q}[k+1] - \dot{q}[k]}{\Delta t} + b[k] = \tau[k] + J[k]^T f[k],    (9.17)

where \Delta t is the integration interval and k is the discrete timestep index, i.e., t = k \Delta t, under the assumption that variable changes, except for q, are negligibly small during \Delta t. Then, Eq. (9.16) becomes
f[k] = \arg\min_f \ \tfrac{1}{2} f^T A[k]\, f + l[k]^T f,    (9.18)

where
As the contact area is convex and any p is a convex combination of the vertices, there always exists a
distribution of forces at the vertices equivalent to any six-axis contact force. Then,
Eq. (9.21) becomes
\hat{f}_{iAB} = \sum_{l=1}^{L} \int_{-\pi}^{\pi} \varepsilon_{il}(\theta)\, K_i(p_{il})^T \phi_i(\theta)\, d\theta = \sum_{l=1}^{L} K_i(p_{il})^T f_{il},    (9.22)

where p_{il} is the lth vertex on the plane, L is the number of vertices, and

f_{il} \overset{\mathrm{def}}{=} \int_{-\pi}^{\pi} \varepsilon_{il}(\theta)\, \phi_i(\theta)\, d\theta \approx \sum_{p=0}^{P-1} \varepsilon_{il,p}\, \phi_i\!\left( \frac{2\pi p}{P} \right),    (9.23)

so that

\hat{f}_{iAB} = \sum_{l=1}^{L} \sum_{p=0}^{P-1} \varepsilon_{il,p}\, K_i(p_{il})^T \phi_i\!\left( \frac{2\pi p}{P} \right).    (9.24)
where \hat{f}_{iAB,K} = [\, f_{iAB,K}^T \ \ \tau_{iAB,K}^T \,]^T, r_{il\,f1} = e_{i\,f1}^T p_{il}, r_{il\,f2} = e_{i\,f2}^T p_{il}, and \varepsilon_{in,Kl} is a
nonnegative value. The other group cannot be transformed into a summation due to
the nonlinear normalization. The proposed calculation method assumes that the sliding
direction e_{is}(p) can be approximated as a linear combination of different e_{is}(p_{il}).
Then, the remaining components of f̂ iAB can be derived as
f_{iAB,K\,f1} = \sum_l t_{il\,f1}\, \varepsilon_{in,Kl}, \qquad f_{iAB,K\,f2} = \sum_l t_{il\,f2}\, \varepsilon_{in,Kl},

\tau_{iAB,K\,n} = \sum_l \left( r_{il\,f1}\, t_{il\,f2} - r_{il\,f2}\, t_{il\,f1} \right) \varepsilon_{in,Kl},    (9.27)

where t_{il\,f1} = e_{i\,f1}^T e_{is}(p_{il}) and t_{il\,f2} = e_{i\,f2}^T e_{is}(p_{il}). To correct f[k], the following linear
programming problem with respect to \varepsilon_{in,Kl} has to be solved:
\varepsilon_{in,K} = \arg\max_{\varepsilon_{in,K}} \left( \frac{f_{iAB,K\,f1}}{f^*_{iAB\,f1}} + \frac{f_{iAB,K\,f2}}{f^*_{iAB\,f2}} + \frac{\tau_{iAB,K\,n}}{\tau^*_{iAB\,n}} \right)    (9.28)

\text{subject to} \quad f_{iAB,K\,n} = f^*_{iAB\,n}, \quad \tau_{iAB,K\,f1} = \tau^*_{iAB\,f1}, \quad \tau_{iAB,K\,f2} = \tau^*_{iAB\,f2},    (9.29)

where \varepsilon_{in,K} = [\, \varepsilon_{in,K1} \cdots \varepsilon_{in,KL} \,]^T. The six-axis kinetic friction force is determined
by substituting the solution of Eq. (9.28) into (9.26) and (9.27).
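The linear program of Eqs. (9.28)–(9.29), a linear objective maximized over nonnegative ε under equality constraints, can be sketched with an off-the-shelf solver; packing the normalized coefficients into c, A_eq, b_eq is left to the caller, and all names here are illustrative:

```python
import numpy as np
from scipy.optimize import linprog

def solve_friction_lp(c, A_eq, b_eq):
    """Solve max c^T eps subject to A_eq eps = b_eq, eps >= 0, in the spirit
    of Eqs. (9.28)-(9.29). `c` collects the normalized objective coefficients
    of Eq. (9.28) and `A_eq`, `b_eq` encode the equality constraints of
    Eq. (9.29). SciPy's linprog minimizes, so the objective is negated."""
    res = linprog(-np.asarray(c, dtype=float), A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x
```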
9.2.3 Simulation
Two tests were conducted on the Choreonoid simulation framework to verify the proposed
method, considering \mu_s = 0.5, \mu_k = 0.3, \lambda = 0.0001, and k = 1000 for both
tests. The simulations were executed on a PC with an Intel Core i7-2720QM
2.20 GHz CPU and 8 GB of RAM. The Runge–Kutta–Gill method was used to
update the calculation every \Delta t = 0.001 s.
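The Runge–Kutta–Gill update can be sketched as a generic single-step integrator; this is the classical fourth-order method, not Choreonoid's internal code:

```python
import numpy as np

def rkg_step(f, t, y, dt):
    """One step of the Runge-Kutta-Gill method (4th order) for dy/dt = f(t, y).
    Gill's variant of RK4 uses sqrt(2)-based coefficients to reduce
    round-off error and intermediate storage."""
    s = np.sqrt(2.0)
    k1 = dt * f(t, y)
    k2 = dt * f(t + dt / 2, y + k1 / 2)
    k3 = dt * f(t + dt / 2, y + (s - 1) / 2 * k1 + (1 - s / 2) * k2)
    k4 = dt * f(t + dt, y - s / 2 * k2 + (1 + s / 2) * k3)
    return y + (k1 + (2 - s) * k2 + (2 + s) * k3 + k4) / 6.0
```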
The first test consisted of a manipulator PA-10 (Mitsubishi Heavy Industries, Ltd.)
picking up a box, whose model is available in Choreonoid. The task was to pick up
a small box and place it on an edge of a larger box. Figure 9.7a, b show snapshots
of the simulation results using the proposed and conventional point-contact method,
respectively. The small box fell after the robot placed it on the edge when using the
proposed method, whereas it stayed on the edge when using the conventional point-
contact method. The proposed method determined an oblique restitution direction of
the surface at t = 5.0 s, which seems a more realistic behavior than that seen using
the conventional point-contact method.
The second test consisted of a cube spinning on the ground. The size and mass of
the cube were 10 × 10 × 10 cm and 0.5 kg, respectively. As depicted in Fig. 9.8, an
external force along the y axis and an external torque about the z axis were applied at
the center and bottom of the cube. Figure 9.9 shows the values where the computed
contact force satisfies the six-axis static friction condition with respect to P = 4, 8,
16, 32. The boundary approaches a curve that corresponds to the theoretical boundary
of the friction torque as P increases.
9 Simulator for Disaster Response Robotics 465
This section first presents the background and motivation for research on machine-
terrain interaction mechanics. Then, a macroscopic approach based on the resistive
force theory (RFT) and its experimental validation are described. In addition, a micro-
scopic approach is explained along with its theoretical implementation. The exper-
imental validation of the proposed macro and microscopic approaches is depicted
with some representative results. Finally, a dynamics simulation demonstrates the
dual-arm construction machine developed for the ImPACT Tough Robotics Chal-
lenge.
Construction machines are widely used for the development and maintenance of
infrastructures such as roads, buildings, and dams. In addition, these machines can
be used for disaster response to remove rocks or soil blocking roads and to clear rubble
and debris from collapsed buildings. Specifically, teleoperated construction machines
deployed in disaster sites can mitigate risks for human operators, who should not
directly operate the machine but can stay at a safe distance. However, teleoper-
ated construction machines are usually less efficient than those directly operated,
because the teleoperator has no access to the rich information available in-situ. In
fact, the communication rate between the machine and its operator is usually lim-
ited. Likewise, cameras mounted on the machine may not provide high-quality and
wide-angle images, and the operator cannot suitably perceive the dynamic response of
the machine operation in real time. Alternatively, autonomous construction machines
represent a promising solution for increasing operation efficiency. Technical research
and developments related to these issues have been demonstrated at the ImPACT
Tough Robotics Challenge as described in Chap. 5 of this monograph.
A high-fidelity dynamics simulator for construction machines would make it possible to
innovate on classical construction processes and products. For instance, a real-time
simulator alongside the operation of an unmanned construction machine can help its
operator by providing the referential motion of the machine or estimating the volume
of excavated soil. In addition, an accurate simulator can improve the task scheduling
and planning of unmanned construction systems. Furthermore, the specific simu-
lation of earthwork equipment such as buckets, tracks, and blades can be used to
optimize their shape or enhance the control of work trajectories. Moreover, a vir-
tual platform based on accurate mechanics of unmanned construction machines can
accelerate the training of operators for skillfully handling the complicated controls
of bucket, arm, booms, and tracks. The abovementioned capabilities demand that
the simulator is endowed with selective and sufficient accuracy as well as computa-
tional efficiency for its successful application. A key issue for realizing a high-fidelity
simulator is machine-terrain interaction mechanics, as construction machines often
traverse rough terrain and excavate soil at ground level. Hence, the forward dynam-
ics of the whole-body motion from the construction machine should be calculated
considering the contact forces and moments generated on the earthwork equipment.
Consequently, determining the interaction mechanics between the machine and ter-
rain is essential to accurately simulate the dynamic response of the machine.
In this research, two distinct approaches regarding macro and microscopic points
of view are considered to address the dynamic behavior of construction machines and
earthwork equipment. The macroscopic approach empirically models the machine-
terrain interaction mechanics by focusing on the resistive force comprising the stress
distribution at the contact patch. The model can be validated by a soil mechanics
test bench. Complementarily, the microscopic approach exploits a discrete element
method (DEM) to precisely simulate the interaction forces and terrain deformation
[8]. Together, these approaches provide theoretical and empirical techniques that can
contribute to the development of high-fidelity dynamics simulators.
Sensor-embedded bucket
Actuator
(Vertical)
Fig. 9.10 Bucket test setup (Left, soil mechanics test bed; Right, sensor-embedded bucket)
In the RFT, the surface of an intruding object is divided into small units, and the stresses acting on each unit are determined. Subsequently, the resistive force is calculated by
the surface integral of the stresses acting on the objects. An open problem on the RFT
is related to terrain deformation: the RFT can solve an instantaneous force acting on
an object placed on or within soil; however, it excludes the terrain deformation due
to the object movement, and therefore the estimated forces during continuous object
motion (e.g., soil excavation by a bucket) may present a low accuracy. Although such
deformation is numerically trackable using the DEM, it may be computationally
expensive.
The key technique in the proposed macroscopic approach consists in experi-
mentally deriving a correction factor for the conventional RFT to account for soil
deformation. This improved RFT (iRFT) basically considers the intrusion angle of
an object. Specifically, the RFT measures the angle from the leveled terrain surface,
whereas the iRFT considers the soil surcharges due to its deformation and measures
the intrusion angle based on the corresponding surface. More details on the iRFT
can be found in [31, 35].
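The RFT-style summation underlying the iRFT can be sketched as follows; the element format and the empirical stress functions are placeholders (not the authors' model), and the surcharge-corrected depth and intrusion angle of the iRFT are assumed to be computed upstream:

```python
import numpy as np

def rft_force(elements, alpha_z, alpha_x):
    """Resistive force by RFT-style summation over surface elements.

    Each element is (area, depth, beta, gamma): plate area [m^2], depth below
    the (possibly surcharge-corrected, as in the iRFT) soil surface [m], and
    the attack/intrusion angles [rad]. `alpha_z` and `alpha_x` are empirically
    fitted stress-per-depth functions [Pa/m]; all names are illustrative.
    """
    fz = fx = 0.0
    for area, depth, beta, gamma in elements:
        if depth <= 0.0:
            continue  # elements above the soil surface contribute nothing
        fz += alpha_z(beta, gamma) * depth * area
        fx += alpha_x(beta, gamma) * depth * area
    return np.array([fx, fz])  # horizontal and vertical resistive force [N]
```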
The iRFT can be validated using a soil mechanics test bench, as that shown in
Fig. 9.10, consisting of a three-DOF motion table with a sensor-embedded bucket.
The motion table can generate arbitrary trajectories resembling soil excavation by
a bucket, whereas the sensor-embedded bucket measures the normal and tangential
stresses generated at the contact patch between the bucket surface and soil. In addi-
tion, a LIDAR sensor captures the 3D soil deformation during excavation, and a
force-torque sensor mounted on the bucket wrist measures the resistive force.
Figure 9.11 shows the typical soil deformation measured by the LIDAR sensor and
the stress distribution calculated by the iRFT while the bucket excavates soil along
an arbitrary trajectory. The stress generated on the bucket surface clearly becomes
larger with the increasing soil surcharge. Figure 9.12 shows the resistive forces of
the calculated and measured data. In this test, the bucket only rotated its wrist at
an arbitrary height and stopped its rotation when the bucket front surface was per-
pendicular to the terrain surface. Then, the bucket translationally moved resembling
a bulldozing motion. From the figure, the forces calculated by the iRFT relatively
match the experimental data. The root mean square error is 12.97 N for the
vertical force and 6.39 N for the horizontal force. The larger error in the
vertical force arises because the force calculated by the iRFT oscillates during the
last phase of the excavation. This is presumably because the soil surcharge (accumulated in
front of the bucket) reaches the angle of repose of the soil. In such a situation, a small
fluctuation of the correction factor causes a large deviation in the calculated
vertical force. This point remains an open issue in the model.
The proposed macro and microscopic approaches can be applied to different engi-
neering problems as they complement each other in terms of calculation accuracy and
computational burden. In fact, the macroscopic approach is computationally efficient
for real-time calculation but with limited accuracy. Therefore, this approach can be
applied for applications such as virtual training platforms or on-site task scheduling.
On the other hand, the microscopic approach demands a high computational cost
but retrieves high accuracy. Therefore, this approach is appropriate for applications
requiring the detailed analysis of physical phenomena from machine-terrain inter-
action or to evaluate the design and control of the machine and related earthwork
equipment. Moreover, the DEM analysis can consider interactions with viscous soil
and wood rubble, which are conditions usually seen in disaster sites. For instance, it
can reproduce the collapse and destruction of wood rubble by setting the appropriate
cohesive force among particles.
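One common way to introduce such cohesion into a DEM contact model is a constant attractive term added to a linear spring-dashpot normal force; the sketch below uses illustrative, uncalibrated values, not the parameters of the simulator described here:

```python
def dem_normal_force(overlap, rel_vel_n, k=1e5, c=50.0, cohesion=2.0):
    """Normal contact force between two DEM particles: a linear spring-dashpot
    model with a constant cohesive attraction, a simple way to let particle
    assemblies mimic cohesive soil or bonded wood rubble.

    overlap   : penetration depth [m] (<= 0 means no contact)
    rel_vel_n : normal approach velocity [m/s]
    k, c      : spring stiffness [N/m] and damping [N s/m]
    cohesion  : constant attractive force [N]
    """
    if overlap <= 0.0:
        return 0.0
    return k * overlap + c * rel_vel_n - cohesion
```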
A typical result of using both approaches is illustrated in Fig. 9.15, which cor-
responds to an excavation analysis of a multi-finger hand [15] developed for the
ImPACT Tough Robotics Challenge. Both approaches reproduce specific aspects
of the excavation motion: the macroscopic approach provides a fast analysis for
real-time computation, whereas the microscopic approach provides a detailed exam-
ination of soil deformation and estimates the excavated soil volume.
In addition, dynamics simulation using the knowledge and findings from these
approaches has been developed. The simulator is useful to applications such as assess-
ing the maneuverability of a dual-arm construction robot, which was developed for
the ImPACT Tough Robotics Challenge. For instance, Fig. 9.16 shows a scenario
in which a machine excavates at the base of a sloped terrain using one arm while
maintaining the other arm holding a rod located at the top of the slope. Likewise,
Fig. 9.15 Excavation simulation for multi-finger hand using the macroscopic (top) and microscopic
(bottom) approaches
Fig. 9.17 Simulation-based verification of unroofing task using dual-arm construction machine.
The top images show the implemented demonstration in the real machine, and the bottom images
show the simulation counterparts
the simulator has verified the feasibility and stability of a real construction machine
during the unroofing task shown in Fig. 9.17.
Overall, this section describes contributions towards the development of a high-
fidelity dynamics simulator for construction machines based on two approaches to
resemble machine-terrain interaction mechanics. First, the macroscopic approach
considers the general stresses generated during interactions. The iRFT can be easily
implemented in the Choreonoid software framework as a user-subroutine function
that calculates the machine-terrain interaction forces.
Besides dynamics, Choreonoid can simulate vision sensors such as cameras and
LIDAR sensors. This is realized by internally using the 3D computer graphics ren-
dering for displaying the simulation results to generate images from the sensor view-
point. In addition, the data of the depth buffer for rendering can be used to simulate a
distance image that can be acquired by either an RGBD camera or a LIDAR sensor.
As vision sensors play an important role in the recognition and operation in disaster
sites, simulating them is essential for an effective disaster response robot simulator.
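The depth-buffer step above can be sketched as follows: for a standard perspective projection with near and far clipping planes, a normalized depth-buffer value is converted to a metric eye-space distance (a generic rendering relation, not Choreonoid's internal code):

```python
import numpy as np

def linearize_depth(d, z_near, z_far):
    """Convert normalized depth-buffer values d in [0, 1] (standard
    perspective projection) to metric eye-space distance, the step needed
    to turn a rendered depth buffer into a simulated RGB-D/LIDAR range image."""
    d = np.asarray(d, dtype=float)
    return z_near * z_far / (z_far - d * (z_far - z_near))
```

At d = 0 this recovers the near-plane distance and at d = 1 the far-plane distance, with the characteristic nonlinear precision loss far from the camera.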
To consider additional conditions for disaster response, we developed visual effects
resembling natural phenomena and a function to simulate wide-angle vision sensors.
Regarding natural phenomena, we implemented fire, smoke, fountain water, fog,
rain, and snow effects for camera image simulation. These phenomena are commonly
found in disaster sites and considerably affect teleoperation and image-based object
recognition. Therefore, simulating them can improve the effectiveness of disaster
response robots. For the simulation, we included a particle system to implement these
phenomena in the rendering function of Choreonoid. For instance, the simulation
illustrated in Fig. 9.3 shows fire and smoke effects, thus resembling fire occurring at
a plant.
As the technology underlying wide-angle vision sensors constantly improves,
such sensors are being increasingly used in disaster response robots to simultane-
ously obtain information from the surroundings of disaster sites, thus improving
the response effectiveness. However, simulation of wide viewing angle has been
limited by the field angle that can be projected onto a single rendering screen. To
increase the viewing angle, we developed a method using plural rendering screens to
cover a wide viewing angle. Simulations of these sensors are illustrated in Fig. 9.18,
which shows image acquisition from a Ricoh Theta camera (The Ricoh Company,
9 Simulator for Disaster Response Robotics 473
Fig. 9.18 Simulation of wide-angle vision sensors. The left image shows the simulation of a Ricoh
Theta camera, and the right image shows the simulation of a Velodyne VLP-16 LIDAR sensor
capable of 360◦ measurements around the yaw axis and ±15◦ around the pitch axis
At the World Robot Summit 2018 [20] to be held in Japan during October, four
categories of robot competitions will take place. The disaster robotics category is a
competition for disaster response robots, where the technologies developed in this
project will be demonstrated. In addition, the Tunnel Disaster Response and Recovery
Challenge in this category is conducted as a simulation-based competition using Choreonoid.
Two main reasons motivate this simulation-based competition. First, it reduces the
costs of preparing both the competition field and the robots, thus allowing more
teams to participate. In fact, as the theme of the competition is a tunnel disaster, the
environment and robots are large in scale, which implies high costs and considerable
effort. Moreover, simulation makes it possible to reproduce realistic behaviors on computers
without requiring the real environment and robots while obtaining similar outcomes.
Second, it increases the capabilities of the simulator and promotes its use through the
competition. Therefore, the simulator can be widely used in the development of
disaster response robots, thus enabling further improvements in the application of
robots for disaster response.
The competition assumes the use of the robots developed in this project shown in
Fig. 9.19. WAREC-1 [10] is a legged robot with a symmetrical arrangement of four
limbs to operate in various configurations, such as quadruped locomotion and two-
arm/two-leg operation resembling a humanoid robot. Participants’ ingenuity should
determine the actual use of this robot according to the situation. Next, the dual-arm
construction robot [36] can be used for tasks requiring high power in tunnel disasters.
Again, participants’ ingenuity will devise the proper use of the two powerful arms.
Fig. 9.19 Standard robot platforms for the tunnel disaster response and recovery challenge
Fig. 9.20 Tasks of the tunnel disaster response and recovery challenge for world robot summit
2018
Finally, the multicopter robot can be used in cooperation with a ground robot for task
execution; up to two robots are allowed to be simultaneously employed during the
competition.
The competition comprises the six tasks (T1 to T6) illustrated in Fig. 9.20. For
these tasks, the robot can be teleoperated, endowed with autonomous operation, or
use a mixture of both. In any case, it is indispensable to equip the robot with vision
sensors, which can be simulated by the corresponding Choreonoid function. Inside
the tunnel, smoke and fire are present in some areas where visibility considerably
diminishes. In addition, communications can be undermined, especially affecting
teleoperation and possibly reducing the robot responsiveness. Therefore, operation
even under very low visibility should be considered, and an autonomous operation
mode for the robot is highly recommended in such situations.
The six tasks of the competition can be summarized as follows. T1 consists of
getting through the tunnel and traversing uneven terrain with debris caused by the
disaster. Therefore, the locomotion ability of the robot is tested during this task.
T2 requires a vehicle inspection that tests the manipulation ability of the robot to
open doors and the vision sensor operation. Then, T3 consists of rescuing victims
within the vehicle. In this scenario, the door of the vehicle is locked and damaged
due to an accident, and hence it must be opened using a hydraulic spreader. In
addition, the robot must safely and carefully take the victim (a dummy doll) out of
the vehicle. Therefore, T3 tests more advanced capabilities than the previous tasks.
In T4, the tunnel contains scattered obstacles that must be removed to allow the
transit of other vehicles. Besides work capacity, this task tests the ability to elaborate
a work plan. T5 demands a fire hydrant operation by manipulating its hose, nozzle, and
valve. This task is implemented using simulation of natural phenomena including
fire, smoke, and fountain water. T6 aims to inspect vehicles surrounded by collapsed
walls. First, a wall above a vehicle should be fixed with a shoring tool. Then, a hole
should be drilled in the wall to allow inspection inside the vehicle. Both T5 and
T6 are challenging tasks that require high-level and sophisticated skills of object
recognition and manipulation.
The six tasks of the competition are challenging not only for robots but also
for simulators. Choreonoid provides all the necessary functions to implement the
tasks, including the functions developed in this project, and the verification and
improvement of these functions will continue throughout the competition. The next
World Robot Summit is scheduled to be held in 2020, and competitions with real
environments and robots will be organized. These efforts aim to make robots more
capable to help during disaster situations, such as tunnel collapse.
Acknowledgements This work was supported by the Impulsing Paradigm Change through Disruptive
Technologies (ImPACT) Tough Robotics Challenge program of the Japan Science and Technology
Agency (JST).
References
10. Hashimoto, K., Kimura, S., Sakai, N., Hamamoto, S., Koizumi, A., Sun, X., Matsuzawa, T.,
Teramachi, T., Yoshida, Y., Imai, A., Kumagai, K., Matsubara, T., Yamaguchi, K., Ma, G.,
Takanishi, A.: WAREC-1 - a four-limbed robot having high locomotion ability with versatility
in locomotion styles. In: Proceedings of the 15th IEEE International Symposium on Safety,
Security, and Rescue Robotics, pp. 172–178 (2017)
11. Holz, D., Azimi, A., Teichmann, M., Mercier, S.: Real-time simulation of mining and earthmoving
operations: a level set-based model for tool-induced terrain deformations. In: Proceedings
of the International Symposium on Automation and Robotics in Construction and Mining
(ISARC), p. 1 (2013)
12. Holz, D., Azimi, A., Teichmann, M.: Advances in physically-based modeling of deformable soil
for real-time operator training simulators. In: Proceedings of the IEEE International Conference
on Virtual Reality and Visualization (ICVRV), pp. 166–172 (2015)
13. Johnson, J., Kulchitsky, A., Duvoy, P., Iagnemma, K., Senatore, C., Arvidson, R., Moore,
J.: Discrete element method simulations of Mars Exploration Rover wheel performance. J.
Terramechanics 62, 31–40 (2015)
14. Kokkevis, E.: Practical physics for articulated characters. In: Proceedings of Game Developers
Conference, pp. 1–16 (2004)
15. Kusakabe, Y., Ide, T., Hirota, Y., Nabae, H., Suzumori, K.: Development of high performance
hydraulic actuators and their application to tough robot hand. In: Proceedings of JSME Conference
on Robotics and Mechatronics, 1P1-09b6 (2016)
16. Li, C., Zhang, T., Goldman, D.I.: A terradynamics of legged locomotion on granular media.
Science 339(6126), 1408–1412 (2013)
17. Lötstedt, P.: Numerical simulation of time-dependent contact and friction problems in rigid
body mechanics. SIAM J. Sci. Stat. Comput. 5(2), 370–393 (1984)
18. LSTC: LS-DYNA User's Manual (2018)
19. Luengo, O., Singh, S., Cannon, H.: Modeling and identification of soil-tool interaction in
automated excavation. In: Proceedings of the 1998 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS), pp. 1900–1906 (1998)
20. METI and NEDO: World Robot Summit. http://worldrobotsummit.org/
21. Nakaoka, S.: Choreonoid: extensible virtual robot environment built on an integrated GUI
framework. In: Proceedings of the 2012 IEEE/SICE International Symposium on System Inte-
gration (SII2012), pp. 79–85 (2012)
22. Nakaoka, S., Hattori, S., Kanehiro, F., Kajita, S., Hirukawa, H.: Constraint-based dynam-
ics simulator for humanoid robots with shock absorbing mechanisms. In: Proceedings of the
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3641–3647
(2007)
23. Nakaoka, S., Morisawa, M., Cisneros, R., Sakaguchi, T., Kajita, S., Kaneko, K., Kanehiro,
F.: Task sequencer integrated into a teleoperation interface for biped humanoid robots. In:
Proceedings of the IEEE-RAS International Conference on Humanoid Robots, pp. 895–900
(2015)
24. Nakashima, H., Fujii, H., Oida, A., Momozu, M., Kanamori, H., Aoki, S., Yokoyama, T.,
Shimizu, H., Miyasaka, J., Ohdoi, K.: Discrete element method analysis of single wheel per-
formance for a small lunar rover on sloped terrain. J. Terramechanics 47(5), 307–321 (2010)
25. NVIDIA Corporation: PhysX SDK. https://developer.nvidia.com/physx-sdk
26. Open Dynamics Engine. http://www.ode.org/
27. Open Source Robotics Foundation: Gazebo. http://gazebosim.org/
28. Open Source Robotics Foundation: ROS Robot Operating System. http://ros.org/
29. Reece, A.R.: The fundamental equation of earth-moving mechanics. Proc. Inst. Mech. Eng.
179(6), 16–22 (1964)
30. Takahashi, H., Minakami, K., Saito, Y.: Analysis on the resistive forces acting on the bucket
of power shovel in the excavating task of crushed rocks. J. Appl. Mech. 339, 603–612 (2003)
31. Tsuchiya, K., Ishigami, G.: Experimental analysis of bucket-soil interaction mechanics using
sensor-embedded bucket test apparatus. In: Proceedings of the Asia-Pacific Conference of
the International Society for Terrain-Vehicle Systems (ISTVS) (2018)
32. Wakisaka, N., Sugihara, T.: Fast and reasonable contact force computation in forward dynam-
ics based on momentum-level penetration compensation. In: Proceedings of the IEEE/RSJ
International Conference on Intelligent Robots and Systems (IROS), pp. 2434–2439 (2014)
33. Wakisaka, N., Sugihara, T.: Loosely-constrained volumetric contact force computation for
rigid body simulation. In: Proceedings of the IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS), pp. 6428–6433 (2017)
34. Yamane, K., Nakamura, Y.: Stable penalty-based model of frictional contacts. In: Proceedings
of the IEEE International Conference on Robotics and Automation (ICRA), pp. 1904–1909
(2006)
35. Yoneyama, R., Omura, T., Ishigami, G.: Modeling of bucket-soil interaction mechanics based on
improved resistive force theory. In: Proceedings of the European-African REgional Conference
of the International Society for Terrain-Vehicle Systems (ISTVS) (2017)
36. Yoshinada, H.: A dual-arm construction robot in ImPACT tough robotics challenge program.
J. Robot. Soc. Jpn. 33(10), 711–715 (2017)
Part V
Evaluation and Human Factors
Chapter 10
Field Evaluation and Safety Management
of ImPACT Tough Robotics Challenge
Abstract This chapter describes the development and the safety management of the
facilities used in the field evaluation of the ImPACT Tough Robotics Challenge (TRC)
in order to accelerate the technology and market innovation. Several evaluation fields
have been developed in the TRC, e.g., a plant mock-up, a UAV evaluation facility, and
a rubble field. In Sect. 10.1 (corresponding author: T. Takamori), the development
of the rubble field is described. The other evaluation fields are developed almost
in the same manner, and the experience and knowledge of the developments will
be carried over to the development of the Fukushima RTF, which will open
in 2020 as one of the largest field evaluation facilities for response robots in
the world. In Sect. 10.2 (corresponding author: T. Kimura), the safety management
associated with the rubble field is explained based on international safety standards.
Two safety principles, the separation principle and the stop principle, are mainly used
for risk reduction. In Sect. 10.3 (corresponding author: R. Sheh), the application of
the Standard Test Method (STM) for response robots performance to the TRC is
discussed for performing quantitative assessment of the robot performance. Related
STMs for the TRC are introduced and Visual Acuity is identified as the most broadly
relevant to all robots in the TRC. The new Automated Visual Acuity test method is
introduced and described here. Each topic is written by the corresponding authors
individually.
Fig. 10.1 Evaluation fields in the TRC: a–c in Tohoku University, d and e in Fukushima Robot
Test Field (RTF)
484 T. Kimura et al.
In accordance with the objectives of the TRC, the objectives of the field evaluation
are set as follows:
• Technical objective: Provide a clear measurement of robot performance for tech-
nology development.
• Industrial objective: Provide a standardized measurement of robot performance
including safety for business promotion.
• Social objective: Provide a standardized measurement of robot performance for
disaster mitigation in a safe and secure manner.
By considering these objectives, the following points are taken into account in the
field development:
• The field is suitable for evaluating robot performance and can test the “toughness”
of the robots.
• The relationship between the field and the corresponding real situation is easy to
understand for potential robot users, which is useful for the social implementation
of the robots.
• The tests in the field are safe and easy to conduct.
According to this, the rubble field is developed as shown in the following subsection.
The rubble field is assumed to be composed of collapsed building parts and concrete
rubble parts (associated with debris flow), and to be used for
• victim finding,
• victim location identification,
10 Field Evaluation and Safety Management of ImPACT … 485
had been on-site in order to check the position and stability of the debris. The pile of
concrete debris was placed on a sheet for clear removal at the end of the TRC project,
which is a requirement of the industrial waste regulation, and for the prevention of
ground sinking.
When piling up the debris, a time-lapse video was recorded, which allowed us to
verify the inside structure of the concrete rubble parts. Figure 10.4 shows an image
of the video. This will be helpful for path planning inside of the rubble pile and
checking its stability.
Figure 10.5 shows the overview of the developed rubble field. The field has been
used in several evaluations of snake robots and rescue dogs with robotics devices
and no serious problems were reported. This implies that requirements 1 to 3 are
satisfied. Because there is no test of concrete debris removal so far, requirement 4
has not yet been examined.
Safety management for the rubble field is explained here, where international safety
standards are used in order to consider several parameters of risk management in a
systematic manner.
The robots in the TRC are designed and used for research purposes, so the technical
safety management measures applied to the robots themselves cannot be the same
as those employed for commercial off-the-shelf robots. To compensate for this,
additional safety management through the personal attention of the staff is required,
taking the individual research situation into account (e.g., in [8]). Such ideas of safety
management for research purposes can be seen in the Machinery Directive of the EU [5],
where “machinery specially designed and constructed for research purposes for temporary
use in laboratories” is excluded from the safety regulations described. Therefore,
safety management in the TRC is carried out based on the following two safety
principles:
• separation principle: The hazard (robot) and humans are separated in space and/or
time to prevent physical contact. An example of the corresponding safety measure
is the use of a physical guard to maintain distance.
• stop principle: Humans may touch the hazard (robot) only when the stop (no-energy)
state is confirmed. An example of the corresponding safety measure is the use of
a rotating warning light to indicate the robot's active status.
Based on these two principles, safety management of the rubble field is developed
based on risk assessment and risk reduction in ISO12100 [11]. Note that these two
principles can be seen in the safety of machinery in ISO standards [11].
For risk assessment, a hybrid method in [9] is used, because the risk parameters
in the method directly correspond to the “three step method” of risk reduction in
ISO12100. See Table 10.3. The level of risk evaluation is set as in Table 10.4.
A part of the initial risk assessment of the rubble field is shown in Table 10.5.
The intended persons in the assessment are assumed to be the robot operator, test
staff, and an outsider (third person). According to the assessment, being crushed by
rubble (No. 4 and 5 in the table) is the most severe risk (value of risk R = 32 and
28), and lightning (No. 3, R = 24) is also severe. According to the risk evaluation
in Table 10.4, these risks must be reduced.
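As a minimal illustration of the risk-estimation scheme above (parameter values taken from Table 10.5; the parameter level definitions and evaluation threshold themselves are given in Tables 10.3 and 10.4):

```python
# Hybrid risk estimation in the ISO 12100 style used here:
# probability of harm Ph = F + P + A, and risk R = S x Ph.

def risk(S, F, P, A):
    """Risk value R = S * (F + P + A)."""
    return S * (F + P + A)

# No. 4 in Table 10.5: operator crushed by rubble (the most severe risk).
print(risk(S=4, F=2, P=3, A=3))  # -> 32

# The same hazard after the three protective measures (P: 3->2, F: 2->1, A: 3->1).
print(risk(S=4, F=1, P=2, A=1))  # -> 16
```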
The risk reduction is carried out based on the three-step method in ISO12100 [11]
by some of the authors. Table 10.6 shows a part of the risk re-assessment after the
risk reduction. The details are as follows:
• In No. 1, for the risk of a collision with falling rubble, by using the protective
measure of maintaining a safety distance, the frequency F is reduced from 2 to
1 and the avoidance A is reduced from 3 to 1, which reduces the value of risk R
from 14 to 8. The safety distance is assumed to be the height of concrete rubble
(2 m) here.
• In No. 2, for the risk of being cut by a steel frame, the protection with soft materials
reduces F:2 → 1, and R:14 → 12.
• In No. 3, for the risk of lightning, the following two protective measures are used:
– Using real-time weather forecast reduces A:3 → 1.
– Preparing evacuation area reduces F:2 → 1.
By combining the two measures, R:24 → 12.
• In No. 4, for the risk of being crushed by rubble for the robot operator on the
rubble, the following three protective measures are used:
Table 10.5 A part of risk assessment of the rubble field (symbols: S (Severity of harm),
F (Frequency), P (Probability), A (Avoidance), Ph (Probability of harm = F + P + A),
R (Risk = S × Ph))

Working phase        | No. | Hazard          | Hazardous situation / event                                        | Assumed harm   | Intended person | S | F | P | A | Ph | R
Meeting on site      | 1   | Physical energy | Work around rubble / Rubble rolls and collides with workers        | Fracture       | Staff           | 2 | 2 | 2 | 3 | 7  | 14
                     | 2   | Sharp edge      | Work on rubble / Cutting with rubble steel frame                   | Incision       | Staff           | 2 | 2 | 2 | 3 | 7  | 14
                     | 3   | Lightning       | Work outdoors / Be struck by lightning                             | Electric shock | Staff           | 4 | 2 | 1 | 3 | 6  | 24
Test                 | 4   | Falling object  | Robot exploration work in rubble / Body sandwiched between rubble  | Crush          | Operator        | 4 | 2 | 3 | 3 | 8  | 32
End test / clearance | 5   | Falling object  | Enter into the rubble field / Body sandwiched between rubble       | Crush          | Outsider        | 4 | 2 | 2 | 3 | 7  | 28
– By using a guide rail for robot insertion into the rubble, the operator can keep a
distance, which reduces P:3 → 2.
– Reducing working time on the rubble, F:2 → 1.
– Using an earthquake early warning system (radio) and monitoring the position
of the fragments of rubble with a match mark (see Fig. 10.6), A: 3 → 1.
By combining these three measures, R: 32 → 16.
• In No. 5, for the risk of being crushed by rubble for outsiders (third person) at the
end of the test for clearance activities, introducing a physical guard reduces F:2
→ 1, A:3 → 1, and R: 28 → 16.
For the validation of the risk reduction and risk re-assessment, the following
safety experts contributed without any major objection: a construction safety expert
who provides on-site safety experiences, an occupational safety expert who provides
occupational safety knowledge based on international safety standards, and a system
safety [3] expert who verifies the safety concept, where system safety is an application
of systems engineering [15].
10.2.1.3 Discussions
The safety management here is based on the international safety standards on “safety
of machinery”, where the machinery is mostly used in production situations with
trained staff for performing routine work. In contrast, the TRC field evaluation is
carried out for research purposes, so the evaluation activities are mostly temporary
and the staff and the participants are diverse (academia from universities and research
organizations, engineers in companies, students, and general audiences). Therefore,
the differences between “safety of machinery” and the TRC field evaluation should
be taken into account in the safety management of the TRC. The differences can be
characterized as follows:
Issues particular to the Field itself (gap to participants)
• Field evaluation in the TRC simulates disaster response activities, which involve
unusual conditions for the general public. This means that the environment of the
TRC contains the potential for severe accidents (e.g., falling from a high position,
falling off a temporary structure, and landslides).
• Disaster sites are different from each other, thus the field conditions of the TRC
would differ between each trial.
• Most participants of the TRC are researchers and students who work in their
laboratories on robot development. This implies
– less/limited experience and physical capacity in field work,
– less/limited knowledge of safety management in field work (covering not only
disaster response activities but also construction sites),
– a focus on their own research objects that reduces attention to their surroundings,
– limited continuity of know-how: many of the field's users are students, who are
replaced by new students after they graduate.
Issues particular to Field Management
• The facilities in the field are new to the participants. They might be different from
the participant’s own facilities (e.g., something is missing), which could confuse
the participants.
• The facilities cannot be modified without permission. This might delay the imple-
mentation of protective measures.
• When one group modifies a facility, another group will be confused if they are not
informed immediately about the modification.
Issues particular to Command Structure
• Several command structures (one team, multiple teams, and all groups) are used
in each situation.
• Command could be provided from another organization, where no direct command
chain exists beforehand.
• Some teams are working in the same place and cannot understand the other team’s
activities beforehand.
So far, concrete debris removal via a construction machine robot has not been
examined in the field evaluation, where the debris condition varies in real-time on
site. Safety management of this evaluation based on the two safety principles is an
issue for further work.
• The shapes of the pieces of concrete debris are diverse, so they cannot be specified
in detail before construction. Thus, the designer of the rubble field worked together
with a construction worker on-site for detailed instruction.
• In concrete debris installation, the sizes of the pieces of debris are roughly specified
before the installation. This makes the instructions from the designer to the
construction worker easy.
• The concrete debris contains reinforcing steel, which makes the debris strong and
stabilizes the rubble pile.
• According to the risk assessment, it is confirmed that contact with rubble leads to
several serious risks.
• Based on the separation principle, the risk parameter frequency F of the exposure
to the rubble can be reduced.
• Safety distance from the rubble pile is set to be the height of the rubble.
• Ropes are useful for indicating the safety distance around the field.
• Match marks are used to monitor the rubble movement. By using the marks, it was
observed that a Level 4 earthquake moved one piece of rubble by 5 mm.
• Many safety measures depend on the activities of individual persons. Therefore,
the workload of each person, especially a safety staff member, should be considered
so as to maintain that person's condition properly.
Standard Test Methods (STM) are a vital tool for communicating operational require-
ments and implementation capabilities, particularly as they relate to response robots.
ImPACT-TRC worked with the US Department of Homeland Security (DHS) -
ratuses. Through this process, existing standard test methods that may have relevance
are also identified and, where necessary, adopted or adapted.
These prototypical standard test methods are deployed at responder training
events. Robots are run through both the standard tests as well as through opera-
tional scenarios. These deployments allow for the refinement of the apparatuses and
procedures by which the tests are performed. They also determine and validate the
relevance of the test methods to the operational scenarios. This ensures that differ-
ences in performance between different robots on the test reflect differences in real
world performance for that particular aspect of the robot’s capability.
As the apparatus design and procedure develop, testing is performed with a wide
variety of robotic implementations, both deployed and in development. This ensures
that the tests appropriately capture and separate both very low and very high levels
of performance without saturating. This process also helps to determine the
repeatability, reproducibility and uncertainty of the metric.
Finally, the test method apparatus design, procedure, metric and other data is pub-
lished and standardized. The DHS-NIST-ASTM International Standard Test Meth-
ods for Response Robots project generally publishes these standards via the ASTM
International Subcommittee E54.09 on Homeland Security Applications: Response
Robots as well as through partner standards organizations such as National Fire
Protection Association (NFPA) Committee 2400 on Standards for Small Unmanned
Aircraft Systems (sUAS) used for Public Safety Operations.
This process is iterative to ensure ongoing relevance. End user groups are con-
tinuously consulted through responder visits, hosting of testing events, and training
exercises. Similarly, industrial development and academic research communities are
regularly consulted via participation in industry and academic conventions and con-
ferences and the hosting of evaluation events, academic competitions and student
summer schools [16].
This process has thus far yielded 18 standardized and a further 46 prototypical test
methods [2], which have been used in the procurement of over US$70 million
in robotic equipment across the US Government between 2011 and 2018.
The ImPACT-TRC project aims to develop robots that can perform under tough
conditions. In evaluating the success of such robots, it is important to determine
how they perform under both normal and tough conditions. Standard Test Methods
provide repeatable, reproducible, statistically significant metrics of performance that
enable performance to be measured under laboratory, normal and tough conditions.
As a result, they allow the ability of robots to perform under tough conditions to be
measured.
to do so with accuracy and reliability. These standard test methods may be adopted
and adapted to the unique challenges posed by the applications of ImPACT-TRC.
• Mobility: Confined Area Obstacles: Gaps (ASTM E2801-11), Hurdles (ASTM
E2802-11), Inclined Planes (ASTM E2803-11) and Stairs/Landings (ASTM
E2804-11).
• Mobility: Confined Area Terrains: Continuous Pitch/Roll Ramps (ASTM E2826-
11), Crossing Pitch/Roll Ramps (ASTM E2827-11) and Symmetric Stepfields
(ASTM E2828-11).
• Mobility: Maneuvering Tasks: Sustained Speed (ASTM E2829-11)
• Mobility: Traverse Pitch/Roll Rail Obstacles (ASTM WK54402)
• Maneuvering: Traverse Angled Curbs (ASTM WK54291), Align Edges (ASTM
WK53649) and Hallway Labyrinths with Complex Terrain (ASTM WK27852).
• Mobility: Traverse Gravel Terrain (ASTM WK35213), Traverse Sand Terrain
(ASTM WK35214) and Traverse Mud (ASTM WK54403).
– Evaluating the ability of robots to traverse terrain is complicated by the wide
variety of terrain and correspondingly wide variety of capabilities. Priorities
among these standard and prototypical test methods will be identified and where
necessary, interpreted and adapted for the domains of ImPACT-TRC.
• Mobility: Traverse Vertical Insertion/Retrieval Stack with Drops (ASTM
WK41553)
– ImPACT-TRC has a particular focus area on Serpentine, Compound and Animal
Cyborg robots with a focus on confined area mobility.
• Endurance (ASTM WK55025)
– Robot endurance is particularly important for robots deployed in tough envi-
ronments but the way in which it is measured has to be relevant and comparable
depending on the task. These principles may be adapted to the ImPACT-TRC
domain and robots or used as the basis for new, specific standards.
• Aerial Robot Maneuvering: Maintain Position and Orientation (ASTM WK58931),
Orbit a Point (ASTM WK58932), Avoid Static Obstacles (ASTM WK58933), Pass
Through Openings (ASTM WK58934) and Land Accurately (Vertical) (ASTM
WK58935).
• Aerial Robot Situational Awareness: Identify Objects (Point and Zoom Cameras)
(ASTM WK58936), Inspect Static Objects (ASTM WK58937) and Map Wide
Areas (Stitched Images) (ASTM WK58938).
• Aerial Robot Energy/Power: Endurance Range and Duration (ASTM WK58939)
and Endurance Dwell Time (ASTM WK58940).
• Aerial Robot Radio Communication Range: Line of Sight (ASTM WK58942)
and Non Line of Sight (ASTM WK58941).
• Aerial Robot Safety: Lights and Sounds (ASTM WK58943)
• Aerial Robot Situational Awareness: Inspect Wires (ASTM WK60210)
– Aerial robots are one important segment of ImPACT-TRC. Curtin has an exist-
ing STM project to develop standard test methods for tethered aerial robots in
support of NFPA 2400 and ASTM E54.09. The complete overlap of the require-
ments of the existing project with the aerial domain of the ImPACT-TRC project
will allow this project to leverage these existing resources.
Of these, Visual Acuity was identified as the most broadly relevant, because visual
sensing plays a vital role in the performance of almost all of the robots in the
ImPACT-TRC project. Furthermore, the visual sensing capabilities of robots tend
to be affected the most by tough conditions. For example, variations in lighting,
smoke, dust, water and wind all affect the ability of robots to visually
survey their environment and inspect objects of interest.
During the ImPACT-TRC project, therefore, a new Standard Test Method for
Automated Visual Acuity Evaluation was developed. This involved the following
activities, consistent with the test method development process outlined above.
• Deciding on the suitable optotype for automated visual acuity measurement.
• Developing the procedure to calibrate the optotype, yielding a measurement com-
patible with existing standards.
• Designing posters that combine both the optotypes as well as the calibration arte-
facts required to effectively use this test method.
• Devising the methodology for performing the different variations of the test
method, under different environmental and robot conditions, at the desired level
of statistical significance.
• Determining the applicability of the test method through testing with a wide variety
of robotic implementations and applications.
• Documenting the procedure, results and applicability of the test method for stan-
dardization.
10.3.2.1 Optotype
Visual Acuity is a measurement of the ability of the robot to see objects of different
sizes. This metric represents the minimum separation of alternating black and white
parallel lines that can be observed by the robot’s camera system. It may be defined as
an angular separation between the lines as measured at the camera, in degrees. This
is similar to how human vision is reported, with 6/6 (20/20) vision corresponding
to one arc-minute (1/60°). Once reported as such, this metric is distance invariant.
Alternatively, in situations where the distance to the chart is not known, such as
during an embedded operational task, it may be defined as the linear separation,
in millimeters, of the lines. It should be noted that the distance-invariant angular
acuity can be converted to a linear acuity given an observation distance through the
formula linear_acuity = chart_distance × tan(angular_acuity). Conversely, a
requirement for a given linear acuity can be converted to an angular acuity at a given
distance through the formula angular_acuity = tan⁻¹(linear_acuity / observation_distance).
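The two conversions above can be sketched as follows (Python used as a neutral notation; distances and linear acuity in millimeters, angles in degrees):

```python
import math

def linear_acuity(chart_distance, angular_acuity_deg):
    """linear_acuity = chart_distance * tan(angular_acuity); same length unit as the distance."""
    return chart_distance * math.tan(math.radians(angular_acuity_deg))

def angular_acuity(lin_acuity, observation_distance):
    """angular_acuity = atan(linear_acuity / observation_distance), in degrees."""
    return math.degrees(math.atan(lin_acuity / observation_distance))

# 6/6 (20/20) vision is one arc-minute (1/60 degree); at the 6,000 mm far-field
# test distance this corresponds to a line spacing of about 1.75 mm.
print(round(linear_acuity(6000, 1 / 60), 2))  # -> 1.75
```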
Tests for visual acuity involve having the robot observe a test artefact, known as
an optotype, consisting of features of different sizes and determining the smallest
feature that can be observed under the test conditions. A pure test of visual acuity
of camera systems is the ISO 12233 [12] test, which makes use of groups of lines
called resolution wedges that start widely spaced and become narrower, as shown
in Fig. 10.7. The spacing of the lines at the point at which all the lines can barely be
observed represents the visual acuity of the camera system under test, under the
conditions of observation.
This manual process is acceptable for formal testing of robots and camera sys-
tems under ideal conditions. However, for the ImPACT-TRC it will be necessary to
perform tests many times, under a wide variety of conditions. Thus an automated
system to perform the Visual Acuity measurement is highly desirable. To this end,
an automated Visual Acuity test has been developed, using the QR Code symbol
[13]. These symbols are easy to decode automatically using software that can ana-
lyze video recorded from the robot’s cameras. QR Codes consist of a grid of black
and white squares that encode information according to a certain protocol. For the
purpose of this test method, the QR Codes consist of a grid of 21 × 21 black and
white squares, with a border of a further 5 white squares in each direction. Each code
is unique and incorporates in it the feature size of the code, which is defined as the
size of the individual squares that make up the code as shown in Fig. 10.7.
10.3.2.2 Calibration

First, the calibration factor of the camera and test setup should be determined. This
number, typically between 0.3 and 0.6, represents the ratio of the actual acuity of the
camera, as defined by the smallest spacing between parallel black and white lines that
can be resolved, over the feature size of the smallest QR Code that the camera and test
software can read at the same distance. This is done by placing the chart, illuminated
to at least 300 lux, against a light colored background. The robot should be placed
close to the chart, such that the QR Code associated with the calibration target is
clearly readable. The distance between the robot and chart should then be increased,
either by moving the chart or the robot, until that QR Code is no longer readable. This
process should be repeated at least 10 times. The frames should be inspected, and
the resolution wedges of those where the calibration code is barely detectable should
be measured to determine the calibration factor as described above. Measuring the
calibration factor should be done with no “tough conditions”.

10.3.2.3 Poster

The way in which the optotypes are presented to the robot makes a significant
difference to the performance of the robot under test. It is important to ensure that
the presentation of the optotypes is representative of the types of features that will
be important to the robot in its operational role, so that the resulting test is
relevant. For the ImPACT-TRC, posters like that shown in Fig. 10.7 were developed.

For visual acuity, the way in which the optotypes are printed on the test chart
can significantly affect the ability of the robot to observe fine detail. Three factors in
particular are important: the layout of the optotypes, the surface finish of the printed
posters, and the contrast of the posters. The layout of the optotypes is important for
two reasons. First, optotypes should be sufficiently spaced apart that they will not
interfere with each other. QR Codes should be surrounded by white space (the “quiet
zone”) to ensure that they can be reliably read. The presence of other printing within
this space may cause the code to fail to read. Second, similarly sized codes should
be in close proximity so that the robot is likely to be able to locate codes that are of
an appropriate size to be decoded. The surface finish of the poster is also important.
The surface should be matte so that reflections of lights and other bright surfaces in
the environment do not interfere with the visibility of the codes. Finally, the contrast
of the poster should be appropriately matched to the background on which the poster
is placed. As shown in Fig. 10.8, a high contrast poster with a white background
becomes impossible to observe against a darker colored environment, while a lower
contrast poster, with the same black printing but a darker background, is still readable.
The Automated Visual Acuity controlled test should then be performed, both
under normal as well as the desired tough conditions. This is done by placing the
robot’s camera system at the far field test distance of 6,000 mm (6 m) from the chart.
The feature size of the smallest code that is read at least 28/30 times in consecutive
frames, multiplied by the calibration factor determined above, is used to compute
the metric under the following formula.
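A minimal sketch of this computation, based on the prose description (the smallest feature size read in at least 28 of 30 consecutive frames, scaled by the calibration factor); the decode counts and parameter names here are illustrative, not the chapter's exact notation:

```python
def visual_acuity(decode_counts, calibration_factor, required=28, frames=30):
    """Automated Visual Acuity metric (sketch).

    decode_counts: {feature_size_mm: successful decodes out of `frames`
    consecutive frames at the far-field test distance}.
    """
    reliable = [size for size, n in decode_counts.items() if n >= required]
    if not reliable:
        return None  # no code was read reliably under these conditions
    return min(reliable) * calibration_factor

# Hypothetical counts: 2 mm codes read 30/30, 1 mm codes 29/30, 0.5 mm codes 12/30,
# so the smallest reliably read feature is 1 mm.
print(visual_acuity({2.0: 30, 1.0: 29, 0.5: 12}, calibration_factor=0.5))  # -> 0.5
```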
Fig. 10.7 An example of a QR code poster with integrated calibration targets. The feature size of
each code, as marked by the measurements (e.g. 2 mm) corresponds to the size of the small squares
that make up the code
Fig. 10.8 Three posters with varying contrast levels. The high contrast (white) poster cannot be
read while the lower contrast poster is still readable against the dark green grass
the ideal case. The robot should use whichever camera is the best for inspection. The
test should then be repeated under various tough conditions to measure how they
affect the visual acuity of the robot. These can include:
• Idling (e.g. For a drone, hovering; for Cyber Rescue Canine, worn but stationary)
• Strong wind
• Rain, smoke, dust
• Difficult communications (e.g. Distance, radio interference)
• Different cameras (e.g. Driving or survey camera)
• Operators of different skill levels
Many of these posters, each containing the range of differently sized codes and
a unique numerical identifier, are then embedded within an operational scenario.
They may be placed at regular intervals in locations that the robot is expected to
observe, as shown in Fig. 10.9, and/or may be concentrated around objects and areas
of particular interest. They represent the ability of the robot system as a whole,
when operated in a particular manner, to observe objects of different sizes in the
environment. Video from the robot's cameras is recorded and analyzed to determine
all of the observed QR Codes. Such a use of this test allows for aggregate measures,
such as how much of the environment was observed at different levels of acuity.
It also allows for specific location measures, which report the level of acuity with
which specific objects of interest were observed (and whether this was sufficient).
Fig. 10.9 Examples of large QR Code posters placed on objects of interest within an environment
10.3.2.5 Applicability
We have evaluated this test method and validated the calibration procedure at four
separate events: A test field near Curtin University in Perth, Western Australia; the
University of Queensland Robotics Design Studio, Indoor-Outdoor Flight Testing
Lab in Brisbane, Queensland, Australia; the Robot Test Facility at the National
Institute of Standards and Technology, US Department of Commerce, Maryland,
USA; and the Fukushima Robot Test Field, Fukushima, Japan. We performed these
calibration validations using three different aerial platforms plus several other camera
systems. We have also collected test data of both the Automated Visual Acuity
controlled test as well as the embedded test.
The Automated Visual Acuity test method is found to be generally applicable
to robots that have a significant precision inspection application. Examples include
those that must inspect for cracks in buildings, labels on packages or the status
of valves and gauges. The significance of the metric should be validated against
these operationally significant objects of interest. This is done by embedding the
posters near operationally significant objects and noting the acuity measurement
when domain experts, such as responders, consider the camera image to be good
enough to perform the task. This should be repeated to yield a distribution of acuity
measurements that represent the ranges of acuity required for that task. For example,
to read standard 10 cm hazardous materials placards, it is generally necessary to have
a visual acuity of approximately 2 mm.
A limiting factor on the applicability of this test method is the need for the QR
Code posters to be quite large compared to the size of objects that the test represents.
For example, for a robot that should inspect objects down to 5 mm in size, with a
typical calibration factor of around 0.3 to 0.6, the QR Code to test the limit of visual
acuity should be approximately 26–52 cm wide. As defined here, this test is less
applicable for robots and cameras intended for wider area survey and reconnaissance.
Alternative codes that do not require as large an optotype relative to the measured
feature size may be more useful in such situations.
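The poster-size limitation above follows directly from the code geometry described earlier (21 × 21 modules plus a 5-module quiet zone on each side, i.e. 31 modules across), and can be sketched as:

```python
def qr_code_width(target_acuity_mm, calibration_factor, modules=21, quiet_zone=5):
    """Width (mm) of a QR code optotype whose smallest feature tests a target acuity."""
    feature_size = target_acuity_mm / calibration_factor  # size of one module
    return (modules + 2 * quiet_zone) * feature_size

# Inspecting 5 mm objects with typical calibration factors of 0.3-0.6 requires
# codes roughly 26-52 cm wide, as noted in the text.
print(round(qr_code_width(5, 0.6)), round(qr_code_width(5, 0.3)))  # -> 258 517
```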
The Automated Visual Acuity standard test method procedure, apparatus and metric
will be incorporated into a future update of the Visual Acuity test method, currently
designated ASTM E2566-17a, Standard Test Method for Evaluating Response Robot
Sensing: Visual Acuity, under ASTM International Subcommittee E54.09 on
Response Robots.
10.3.3 Conclusion
In this section, we have evaluated the suite of Standard Test Methods for Response
Robots, developed as part of the DHS-NIST-ASTM International Standard Test
Methods for Response Robots and under the ASTM International Subcommittee
E54.09 on Homeland Security Applications: Response Robots. In evaluating the
impact of STM on the ImPACT-TRC, we have determined that visual sensing is a
common, critical component of most of the tasks of the robots in ImPACT-TRC.
Therefore, we have selected Visual Acuity to extend, as a way of evaluating the
impact of tough conditions on robot capabilities.
To evaluate the toughness of visual sensing capabilities, it is necessary to measure
the Visual Acuity of the robots under both normal and tough conditions, as well
as regularly during operational scenarios. The existing Visual Acuity test method
requires manual evaluation of the robot’s observation of the test artefacts and thus is
impractical for the regular measurements required.
We have developed a new procedure for using QR Codes as optotypes for eval-
uating Visual Acuity in an automated fashion. This includes the development of
combined posters of QR Codes of different sizes, along with procedures for using
them to calibrate the Visual Acuity measurement against existing standards. We
have validated the calibration method at sites in Australia, the United States and
Japan. We have also collected data at events in all of these sites, most recently at the
ImPACT-TRC Evaluation Event in Fukushima. We are now in the process of contin-
uing to collect data to validate this new procedure and standardising this Standard
Test Method through ASTM International Subcommittee E54.09. We will also be
performing additional data gathering and validation at the Fukushima Robot Test
Field during the ImPACT-TRC Final Evaluation Event.
Acknowledgements This research was funded by the ImPACT Tough Robotics Challenge Program
of the Council for Science, Technology and Innovation (Cabinet Office, Government of Japan). We
are grateful to Adam Jacoff of NIST and his STM teams for their valuable comments on the STM
applications. We also thank the following safety experts who provided technical advice on safety
management: Ryuichi Okamura, Tsutomu Nagi, Masashi Okuda, and Koji Oga.
References
1. Ambe, Y., Yamamoto, T., Kojima, S., et al.: Use of active scope camera in the Kumamoto
earthquake to investigate collapsed houses. In: Proceedings of the 2016 IEEE International
Symposium on Safety, Security and Rescue Robotics, pp. 21–27 (2016)
2. ASTM, Subcommittee E54.09: published standards under E54.09 jurisdiction. https://www.
astm.org/COMMIT/SUBCOMMIT/E5409.htm (2018). Accessed 23 July 2018
3. Department of Defense of US, MIL-STD-882E System Safety (2012)
4. Disaster City home page. https://teex.org/Pages/about-us/disaster-city.aspx. Accessed 15 April
2018
5. European Union, Directive 2006/42/EC of the European Parliament and of the Council of 17
May 2006 on machinery, and amending Directive 95/16/EC (2006)
6. Fukushima Robot Test Field home page. https://www.pref.fukushima.lg.jp/site/robot/.
Accessed 11 Sept 2018 (in Japanese)
7. Hamada, R., Ohno, K., Matsubara, S., et al.: Real-time emotional state estimation system for
canines based on heart rate variability. In: Proceedings of IEEE Conference on Cyborg and
Bionic Systems, pp. 298–303 (2017)
8. Igarashi, H., Kimura, T., Matsuno, F.: Risk management method of demonstrative experiments
for mobile robots in outdoor public space. Trans. Robot. Soc. Jpn 32(5), 473–480 (2014). (in
Japanese)
9. Ikeda, et al.: Development of template of risk assessment sheet for personal care robots. In:
Proceedings of the 29th Conference of Robotics Society Japan, RSJ2011AC2B1-1 (2011). (in
Japanese)
10. Impulsing Paradigm Change through Disruptive Technologies Program (ImPACT), Tough
Robotics Challenge (TRC) home page. http://www.jst.go.jp/impact/en/program/07.html.
Accessed 15 April 2018
11. International Organization for Standardization, ISO 12100: 2010 Safety of machinery – General
principles for design – Risk assessment and risk reduction (2010)
12. International Organization for Standardization, ISO 12233:2017 Photography – Electronic still
picture imaging – Resolution and spatial frequency responses (2017)
13. International Organization for Standardization, ISO/IEC 18004:2015 Information technology
– Automatic identification and data capture techniques – QR Code bar code symbology spec-
ification (2015)
14. Jacoff, A., et al.: Guide for Evaluating, Purchasing, and Training with Response Robots
Using DHS-NIST-ASTM International Standard Test Methods. http://www.nist.gov/el/isd/ks/
upload/DHS_NIST_ASTM_Robot_Test_Methods-2.pdf (2014)
15. National Aeronautics and Space Administration (NASA), US, Systems Engineering Handbook
Rev. 2 (2016)
506 T. Kimura et al.
16. Sheh, R., et al.: The response robotics summer school 2013: bringing responders and
researchers together to advance response robotics. In: 2014 IEEE/RSJ International Conference
on Intelligent Robots and Systems (IROS 2014), pp. 1862–1867 (2014)
17. Tadokoro, S., Uchizono, T.: ImPACT tough robotics challenge. In: Proceedings of the 2017
JSME Conference on Robotics and Mechatronics, 1P1-R01 (1–2) (2017). (in Japanese)
18. Yoshida, H., Yokokohji, Y., Konyo, M., et al.: ImPACT tough robotics challenge (TRC) con-
struction robot - field evaluation experiments using double swing dual arm model. In: Proceed-
ings of the 2018 JSME Conference on Robotics and Mechatronics, pp. 2A1–J02(1)–(4) (2018).
(in Japanese)
Chapter 11
User Interfaces for Human-Robot
Interaction in Field Robotics
11.1 Introduction
This chapter proposes thirty-two guidelines for pro-actively building a good human-
robot user interface. The guidelines can be applied while the developer is building
the robot rather than having to wait for the final version to be completed. They
are illustrated by case studies of two ImPACT robot systems: the cyber K9 and
construction robot projects.
The guidelines stem from an ecological view of human-robot interfaces [2], rather
than a user-centric view, because most field robots are being developed for formative
work domains. Formative work domains are ones in which the technology, and how it
R. R. Murphy (B)
Texas A&M University, College Station, TX, USA
e-mail: robin.r.murphy@tamu.edu
S. Tadokoro
Tohoku University, 6-6-01 Aramaki-Aza-Aoba, Aoba-ku, Sendai 980-8579, Japan
e-mail: tadokoro@rm.is.tohoku.ac.jp
will be used, are not yet established. High impact, innovative robots are formative by
definition because they either perform completely new functions or perform functions
that humans do, but in very different ways (e.g., a Roomba vacuum cleaner versus a
maid). As such, it is difficult to perform a speculative work analysis until the robot
and interface are built at least to the level of a prototype. But if the user interface
is poor, then tasks will not form because the end user is thwarted in exploring and
exploiting the robot's potential. Thus any work analysis will be flawed. This leads
to a cycle: a work analysis requires a robot with a good interface, but building a
good interface requires a work analysis.
The chapter postulates that a designer will likely have to build three notably differ-
ent interfaces at different points in the development process: a diagnostic interface for
developers to monitor and debug the robot using their expert knowledge, an end-user
interface which is tailored to the tasks and levels of comprehension that operators
and knowledge workers must provide, and an explicative interface to enable pro-
gram managers, the press, and the general public to visualize the important scientific
achievements afforded by the robot.
The chapter is organized as follows. It first presents three reasons why user inter-
faces for robots are different, and harder to design, than computer interfaces. Next,
it describes three types of user interfaces and their importance during phases of
technical maturation. The thirty-two guidelines are then introduced. As diagnostic
interfaces are what roboticists naturally build for themselves and are less vulnera-
ble to interface problems, the other two categories, the execution interface and the
explicative interface, are discussed. The chapter then focuses on two case studies,
referring back to the guidelines.
Human-computer interaction (HCI) has been studied for decades, but while HCI
principles such as Shneiderman's Eight Golden Rules [33] and Nielsen's 10 Usability
Heuristics for User Interface Design [25] are necessary for good robot interfaces,
these principles are not sufficient for at least three reasons described below. Thus,
additional principles are needed for robotics to establish a framework. Furthermore,
robots present a major challenge for simply applying existing HCI design rules: There
is rarely a single, or canonical, “end-user.” Prior field work in robotics, especially
[1, 24, 30], has identified user roles beyond that of the operator. Good HCI principles
state that interfaces should be tailored for the role of the end-user, so a robotics
developer has to design an interface for each of the multiple roles, not just for the
operator. This chapter goes further and posits that there are three major categories
of robot interfaces. One is the End-User interface that permits execution of the robot
and is the aspirational user interface. But that interface is often really a diagnostic
interface used by the developers to control and debug the robot, which the designer
mistakes as being satisfactory for the operators. A third interface is needed to allow
sponsors, program managers, the press, and the general public to visualize what the
robot is doing and explain the scientific achievements of the system, similar
to how IBM Watson displayed its ranking of choices while playing on Jeopardy!
There are at least three major differences between human-robot interaction and HCI.
The differences mean that HCI design guidelines, often called rules, heuristics, or
principles, are incomplete.
One difference is that the interface is mediating an active relationship between a
human and a physically situated agent that is not a peer. Unlike a computer, which
is generally a tool that works in the ether, a robot is physically situated in the real
world. Unlike a human, who is physically situated in the real world, a robot does not
have the same range of cognitive and physical abilities and thus cannot be treated as
a peer. The human will be actively directing the robot, perhaps not continuously, but
the human is always responsible for the correct operation of the robot and thus the
relationship is active.
As noted in [15], there are aspects of human-robot interfaces which seem to be
similar to computer-supported cooperative work (CSCW), also called groupware.
This is especially true of the challenges in human teams of coordinating with, or del-
egating to, other individuals who may have differing cognitive abilities. The interface
may be designed to support the human in applying their knowledge or expertise or
using historical data beyond the immediate data being provided by the individual
members of the group in order to meet the larger goals of the mission.
But human-robot interfaces are different from groupware. The human is more
closely, and explicitly, directing the robot than would be done with a human subor-
dinate and has more responsibility for the robot than a human group member. The
human is responsible for the robot's health and safe operations, which normally
would be only a secondary responsibility of a human director. Human-robot inter-
faces are different from groupware because it may be much harder for the human
to direct, control, or anticipate problems with the robot. For example, a human may
make errors controlling an agent that has four legs or moves like a snake.
A second reason why robot interfaces are different, and HCI principles thus insufficient,
is that field robots tend to be used for high consequence missions with high cognitive
activity loads. As a physically situated agent, the robot impacts the environment,
either moving through it or altering it; therefore, the robot has consequences. Unlike
computer interfaces where actions can be rolled back with undo commands following
the HCI tenet of reversibility [33], a robot cannot undo a collision or other disastrous
action. The potential for negative consequences increases with high consequence
missions, such as disaster response. Any action during a disaster response is high
consequence, as the timely acquisition of information about the event or an interven-
tion to mitigate the incident can impact lives and economic recovery. An error by the
robot can make the situation worse, such as when a robot failed at the Pike River
mine disaster in New Zealand and blocked the only access to the mine [22].
The activity load is higher than for non-robotics applications because of the
increased mental-resource, perceptual, and teamwork demands, following
Wickens’ Multiple Resource Theory [38]. In terms of mental resources, every mis-
sion involves two missions: one is the safe mobility of the robot, and the other is
the actual mission, the reason the robot needed to move, either to get data or to manipulate
the environment. Thus an operator will have to expend more mental resources to
direct a robot than a computer program. In terms of perceptual demands, controlling
a robot generally involves not only perceiving the state of the robot as displayed by
the interface, which is standard for HCI, but also perceiving the world through the
robot, as mediated by the interface, which is not a typical goal of HCI or groupware.
In terms of teamwork demands, the human has to work with the robot, which may
not be a good team player (see Bradshaw [13]). The human is likely to be working
with experts who want to consume or direct the flow of data from the robot as per
[21]. In some instances, the robot itself may have an implicit social relationship with
a victim; this adds to the team work resource demands.
A third reason why robot interfaces are different from computer user interfaces is
that there is rarely a single, representative end-user for a field robot and thus
a developer must design multiple interfaces. Since 2002, studies have shown that
disaster robots are more effective if one person operates or drives the robot while a
second person manages and interprets the sensor data [1, 4, 7, 24, 30]. The driver or
robot-oriented role is called the Operator or Pilot while the mission-oriented role has
been called the Mission-Specialist [1, 24, 30], Payload Specialist [7], or Knowledge
Worker [21]. Each role merits a unique user interface, one that is explicitly tailored
to the role [33]. Peschel and Murphy [30] showed improved performance, and more
importantly increased comfort, by giving a responder an interface to a UAV that
eliminated the data used by the pilot and added widgets with mission-relevant infor-
mation.
Robots often engage two additional roles beyond the End-User: the Developer and
the Public. Consider that robot interface development differs from computer soft-
ware interface development because robotics introduces a physical component. The
physical robot is often speculative and being developed with a spiral lifecycle, inter-
fering with creating a suitable interface. But the early stages of the development of
the hardware often require a good user interface, just not for the as-yet non-existent
End-User. Certainly, an interface is needed for the developer to diagnose the state of
the robot and make sure the hardware and software is working correctly. In addition
to the developer, the project will most likely demonstrate robot progress to sponsors,
contract managers, and even the press. These demonstrations may be crucial to the
overall project, but developers often attempt to use the interface intended for them,
as an expert designer with a high degree of familiarity with the system, to explain
the innovation or state of the system. Unfortunately, these members of the Public
have a different background and different goals and may not be able to comprehend
the display. Good HCI principles state that since the Public has a different role, they
merit a different interface.
The first step in designing a human-robot interface is to decide which role the interface
is intended to support: Developer, End-User, or Public. This section defines each of
the roles and the cognitive work that the interface should enhance. The cognitive
work is characterized using the five areas for a cognitive work analysis following
Vicente [36], which are: a description of the work domain, the strategies used to
accomplish the work, critical control tasks, the worker competencies, and the social
organization and cooperation of the work.
Figure 11.1 provides a visualization of the user interfaces that a robotics project
will entail. Note that the taxonomy indicates that a developer should expect to design
and implement a minimum of four interfaces, possibly more depending on the number
of different mission-oriented End-Users. Fortunately, the designer does not have to
build all of the interfaces at one time; there is a natural progression of interfaces
through the phases of technical development.
The Developer Diagnostic Interface is the default interface; it is intended for, and
built by, the developer. The work domain, the strategies, and control tasks all focus
on enabling the developer to notice any deviations in the robot's functionality, spot
precursors to a problem or unsafe condition, and to be able to suspend or stop the
robot in a safe state. In order to accomplish this, the interface will likely have to
display the internal state of the robot and provide a visualization of the comparison
of the robot’s true state with either it’s expected state or a ground truth. The worker
competency of the Developer is very high, most likely with advanced degrees in
engineering and robotics as well as an intimate knowledge of the robot.
Widgets in the Developer Diagnostic display may be reused for the End-User
displays. However, note that while the developer’s operation of the robot may mimic
the control tasks and strategies used by End-Users, the developer is rarely an expert
in the work domain who knows exactly what the End-User is trying to do and how they
do it. Likewise, the developer is not guaranteed to have access to high fidelity social
organization and cooperation scenarios, so the interface is likely to be uninformed.
Thus an unmodified Developer Diagnostic Interface is unlikely to be effective as
an End-User interface. Figure 11.2 is an example of a developer display that is
adequate for observing system performance details but would not be amenable to execution
or explication.
The Public Explicative Interface is intended for project managers, sponsors, contract
administrators, and the press. The work domain, the strategies, and control tasks all
focus on enabling the audience to observe the overall actions of the robots, to under-
stand why those actions are being taken, and to appreciate any scientific innovations.
The Public is unlikely to touch or operate the robot except in extremely scripted
conditions. In order to enable the Public’s observation of the relevant science, the
interface will likely have to display a portion of the internal state of the robot and
provide a visualization of the comparison of the robot’s true state with either its
expected state or a ground truth. The interface must direct attention to the innova-
tions and correct functioning of the robot, thus it may have fewer widgets and icons
than either a Developer or End-User interface. However, the interface may require
substantial effort to portray the software operation and autonomy in the robot. The
worker competency of the Public is generally very low, although they may have a
background in engineering or be representatives of the End-Users.
The Public Explicative interface will likely have “extra” widgets that are not
used in either the Developer Diagnostic interface, because they may be too simple
or abstract to be of value to the developer, or the End-User interface, because they
explain in detail what is happening but the End-User may not need that detail for
making decisions in the expected operations tempo. The Public Explicative interface
will also tend to have larger and bolder fonts to enable an audience to see the display
from a distance; they are unlikely to be sitting in front of a laptop. Figure 11.3 shows
an explicative interface where the graphs add no value to the End-User but illustrate
what is going on internally in the system.
Fig. 11.3 The explicative display for the cyber-K9 project, clearly showing the operation of different
functions as graphs coordinated with the movement of cameras and path
The End-User Execution Interface is intended for each end-user, and there could be
multiple types of end-users, to execute the mission. The work domain, the strategies,
and control tasks all focus on enabling the End-User to safely control the robot,
manage its health, and conduct the mission. The interface must enable the End-
User to quickly determine and instigate the correct actions for safe movement of
the robot or to maintain a fast operations tempo of the mission. Thus the interface
widgets should aim to provide directly perceivable affordances of the state of the
robot or potential consequences. Also the interface is unlikely to display internal
state data, only “go, no-go” indications. The interface may allow the end-user to
diagnose problems or explore but those would be secondary, and rare, operations
and thus not highlighted on the display. The worker competency of the End-User
is generally very high for the mission, lower for the robot; end-users will not have
advanced degrees in engineering and robotics.
The End-User interface will likely have icons that translate the data in the widgets
in the Developer Diagnostic interface into affordances (e.g., red, yellow, or green
colors, status bars, etc.). The relationship between widgets or windows must be
visually obvious in order to reduce the cognitive load on the End-User. The End-
User interface is the most difficult to build because it requires a working prototype of
the robot, an initial interface, access to high-fidelity field conditions, and multiple
end users for a domain analysis. Figure 11.4 shows a view of the execution display
for the canine team leader.
Fig. 11.4 The team leader execution display for the cyber-K9 project
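The translation from internal telemetry to a "go, no-go" affordance described above might look like the following sketch. The threshold values and field names are hypothetical, not those of the cyber-K9 system:

```python
def status_affordance(battery_pct: float, link_quality_pct: float) -> str:
    """Collapse internal state into one traffic-light affordance.

    The worst subsystem dominates, so the End-User sees a single
    red/yellow/green cue (G3.4.2 semantics) instead of raw numbers.
    """
    worst = min(battery_pct, link_quality_pct)
    if worst >= 50.0:
        return "green"   # go
    if worst >= 20.0:
        return "yellow"  # caution: degraded but operable
    return "red"         # no-go
```

The raw percentages would remain available on a secondary diagnostic view; the execution display shows only the color, preserving the operations tempo.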
A designer does not have to build all three interfaces simultaneously; the need for
each interface emerges as the robot technology matures. This section will utilize the
NASA Technology Readiness Level (TRL) model as the reference for technological
maturity; the levels are shown on the left of Fig. 11.5.
Figure 11.5 shows the rise of importance and associated effort of each category
of interface as the technology matures. The effort that is put into the design and
refinement of each display is represented by the intensity of the gray fill;
darker means more work. The first interface will be the Developer Diagnostic interface
for the designers to use. A successful interface enables the developer to comprehend
the state of the robot and diagnose problems. The effort needed to build that interface
and the need to rely on that interface are lessened as the technology matures and
becomes production grade around Levels 8 and 9. At that point, the
technology should not need constant diagnostics or tweaking.
As the design process moves beyond implementation and debugging work and
becomes more tangible, the designer may need to demonstrate progress to the Public
starting around Level 4. At earlier conceptual levels of development, the scientific
advances may be too abstract or theoretical to attempt to visualize. The demonstration
of progress can be either in simulation, with a robot prototype, or both. The need to
develop and refine the Public interface diminishes as the robot becomes more mature,
as the utility and innovation should be much more apparent. A successful interface
enables the Public to understand the scientific advance.
Starting as early as Level 5, the project may have a workable robot and thus can
begin work on the End-User interface. The designer will most likely concentrate
on the Operator role so as to co-develop fundamental features for controlling the
Fig. 11.5 Time line of interfaces as a function of technological maturity of the robot system. TRL
metrics Courtesy of NASA
robot. As the technology matures, the interfaces for the Mission Specialist or other
roles can be added, leading to a complete system. However, the End-User interface
may have to go through several iterations because the usability by the end-user is
hard to predict in formative domains and the real missions and control strategies are
still in flux.
This section presents 32 guidelines for human-robot interaction user interfaces, syn-
thesized from the human-computer interaction, HRI, and computer-supported
cooperative work communities. The guidelines will be discussed in more detail in the following
sections, as the applicable guidelines depend on the context of the intent, role, inter-
face technologies, and other factors. It is impossible to create specific, unbreakable
rules for designing user interfaces because domains and user characteristics may
be very different. Guidelines are sometimes called heuristics to emphasize that they
are for the general case but may need to be tailored or relaxed for individual cases.
The guidelines are grouped into four general categories: roles (3 guidelines), layout
appropriateness (7), the Four C’s of content, comparison, coordination, and color
(18), and general interaction with, and through, the display (4). The guidelines are
numbered so as to permit easy cross-referencing in later portions of this chapter.
The three guidelines for designing for roles are:
G1.1 Identify the user roles (Developer, End-User, or Public) and associated
type of user interface (diagnostic, execution, explicative), then design accord-
ingly [30]. Remember that there may be more than one End-User role.
G1.2 Explicative interfaces should avoid picture-in-picture (PIP) displays or
selectable windows with the expectation of switching views because the Public
will not be able to control the views. PIP may work if the presenter manually
switches views as part of the demonstration.
G1.3 Explicative interfaces should eliminate any diagnostic or operator-
centric content that does not illustrate the important scientific contribution being
highlighted. The extraneous content may distract the Public.
G2.1 Design for the footprint of the target display device. If there are multiple
possible display devices then each may require a different layout.
G2.2 The primary window should be largest, most captivating, and preferably
most central. If there are multiple windows, there will be a window that the user
looks at the most often for a task; this is the primary window and the remaining
windows are secondary. The most important information should be in the primary
window and have the largest display space, as that is where a user will look first
(i.e., large means important) and probably in more detail (i.e., large means more
room to put important information in readable fonts or discernible icons).
G3.1 Content
G3.2 Comparison
G3.3 Coordination
orientation arrow [28, 31, 37]. For example, a vector on the icon of a robot on a
map can indicate the direction the robot’s camera is pointing.
G3.3.3 Maintain consistency (use of color, same fonts, same line type, etc.)
across windows and widgets. This is another instance of the 8 Golden Rules
[33]. For example, having unique labeling characteristics for each window
implicitly suggests that content is totally isolated and not related to other win-
dows.
G3.4 Color
G3.4.1 Less is more. Following Miller's rule [20], 7 ± 2 items is the maximum
number whose meanings users will likely be able to remember. An expert trained on
the interface can mentally group items into chunks and thus remember more,
but general users cannot be expected to form chunks quickly.
G3.4.2 Reserve red and green for “bad” and “good”. These are the common
interpretations of these colors, consistent with the 8 Golden Rules [33], and they
allow users to immediately understand the situation rather than having to cognitively
recall semantics and infer relationships, which would violate the 10 Heuristics [25].
G3.4.3 Use colored fill rather than colored fonts. Colored fonts are usually
difficult to read; either the colors are hard to see (e.g., yellow font on a white
background) or the colors are not distinct enough (e.g., black and dark blue).
G3.4.4 Represent continuous valued data with an intensity gradient to show
the degree of membership or severity. Data is generally discrete or continuous.
If the data is continuous and there are no natural thresholds, then a color gradient
may be better at conveying change than a series of colored blocks.
G3.4.5 Texture, such as a gradient pattern, can help users discern the ori-
entation of a colored patch [8].
The four guidelines for general interaction with, and through, the display are:
G4.1 Remember Nielsen's “Powers of 10” in user interfaces. The powers
of 10 are summarized here [26]. A user will expect the computer to respond
to commands within 0.1 second, get impatient at 1 second, and assume there is a
problem if there is a 10 second delay. Users expect to complete simple tasks within
1 min. Therefore, the End-User interface design must support this. A member of
the Public may spend less than 10 min viewing a demo or display, reinforcing the
need for immediately comprehensible explicative interfaces.
G4.2 Minimize the number of clicks (or pull down menus) needed to reach a
goal [3]. Users prefer one-click for frequently selected functions, consistent with
the Powers of 10.
G4.3 Design for transitional users. Remember that the user may be transitional:
they start out as a novice and then may want expert functions or shortcuts as they
become more proficient. An expert-oriented interface will overwhelm a novice.
Thus, the interface should allow for power-keys or short-cuts.
G4.4 Permit users to tailor the display to their preferences, where this does not
introduce pre-conditions for human error [33].
The explicative interface aims to highlight internal states of the robot and to visualize
its unique features. All three of the role-related guidelines
(G1.1–3) are important, especially G1.3 which emphasizes the point that diagnostic
displays for a PhD roboticist who has been working with the robot for years may
not be effective for explaining to a general audience who has never seen the system
before. Layout appropriateness of windows and widgets (G2.1–2.7) is as critical to the
Public as it is to an end-user. All of the Four C’s (G3.1–3.4) are relevant to explicative
interfaces. Unlike End-User displays, the guidelines for comparison (G3.2.1–3.2.5)
may be the most important heuristics. The general interaction guidelines (G4.1–4.4)
are less relevant because the Public is unlikely to be directly engaging the interface.
However, Nielsen’s Powers of 10 (G4.1) suggests than the audience may not devote
more than 10 min of their attention to the robot, reinforcing the need for the interface
to be simple and focused on explication.
Possibly the most common violations of the guidelines for explicative interfaces
are:
• Creating a layout that places windows and widgets so that they fill display space,
possibly in an attempt at attractiveness, but neither provides semantic cohesion
nor exploits natural viewing patterns, violating G2.4 and G2.5. This can also
stem from trying to reuse a developer’s diagnostic display without rethinking the
design.
• Providing confusing comparisons, such as superimposing complex shapes or using
representations whose details distract from the comparison, violating G3.2.
• Not indicating what is normal and not normal, again violating G3.2.
• Using too many colors, or colors used inconsistently, resulting in a confusing visual,
violating G3.4.
Figure 11.3 shows an explicative interface that conveys the internal computations
of algorithms without violating any of the guidelines.
Figure 11.6 shows the cyber-enabled canine. The canine carries a camera, two-way
audio, GPS, and a wi-fi transmitter. The human handler for a canine is able to interact
with the dog through a smartphone. The team leader in charge of the multiple dog
teams that may be deployed to simultaneously search different areas would use a
tablet or laptop. This section concentrates on the interface for the team leader End-
User and provides a case study of a) layout appropriateness for the device footprint
and needs of the role and b) the use of color to coordinate semantically related
information about different dogs.
The cyber backpack is at a high TRL. The goal was to design an explicative user
interface and two End-User interfaces. There were two attributes that should make
designing user interfaces easy. One is that the roles were known: there were
dog handlers who wanted to interact with their dog and a team leader who needed
to maintain awareness of the dogs and their progress, help interpret what the dog is
signaling, and provide assistance to individual human-canine teams. The need for the
team leader to manage multiple human-canine teams made designing the interface
hard.
Figure 11.7 shows the dog handler’s user interface. It is divided into two major
windows: a dog-centric window, with tabs for each active dog team so that the handler
can coordinate and share with other teams, and a map-centric overview of the selected
dog’s path and activities. Each window has a bar underneath with icons for additional
functions (take a picture, place an icon on the map, bring up more functions, etc.).
The gray border emphasizes the semantic grouping of a window and its icons, while
the colored border around the dog view emphasizes that set of semantic groupings.
11 User Interfaces for Human-Robot Interaction in Field Robotics 523
As described below, the interface highlights guidelines from Sect. 11.4. This section
is organized by the four categories.
In terms of roles, the designers identified the role as End-User (the dog handler) and
the type of display as execution (G1.1).
In terms of layout appropriateness, the display is sized for a tablet or a laptop
(G2.1). The view of the dog camera is the largest window, as that is the primary
function (G1.2). Note that it is not central but located on the right side of the screen,
following the normal reading pattern of left to right (and overview to detail), in keeping
with G2.4 and G2.5. The interface has distinctive icons for additional functions in
borders around the windows rather than overlaid on the windows, following G2.3
and G2.6. The use of tabs for the different dogs and icons for additional functions
supports the dynamic nature of the mission (G2.7), where the team leader may need
to look at specific dogs or take pictures and screenshots.
In terms of the Four C’s, each window clearly reflects one type of content and func-
tion (G3.1.1), either exocentric map-based overview or dog-centric view (G3.1.3).
The map window does have overlays for path and points of interest but these are
common for maps. The map window has standard features the users would expect
such as zoom (G3.1.4). The dog-centric window does not have overlays (G3.1.2).
There is the danger of visual capture (G3.1.5) from the dog-centric view but that is the
primary view. The windows have different viewpoints that are implicitly coordinated
based on the additions to the path. An earlier version tried to explicitly show where
the camera was pointing on the external view but that was inaccurate (the dog moves
a great deal) and cluttered the image. Thus G3.3.1 and G3.3.2 were not applicable.
Each tab for a dog in the dog-camera window had a unique color that was used to
reflect their path on the map, maintaining consistency (G3.3.4). Notice also that very
few colors were used (G3.4.1). Red, green, and blue were used for the dogs, favoring
primary colors. While this violates G3.4.2, there was no “bad”
or “good” information on the display, so this was not confusing. The path of the dog
was always yellow, with red indicating barking; this overloads “red” as a color for a
dog, violating G3.4.2, and misses an opportunity to semantically coordinate the dog
with its path (G3.3). Notice also that the tabs shift to color fill to make it clear which
dog’s camera is being viewed (G3.4.3), and the colored border reinforces which
dog is being observed.
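The color guidance above can be read as keeping a single source of truth that maps each dog to one hue, which every widget (tab, border, map path) then queries; that is what prevents a color such as red from carrying two meanings. A minimal sketch, with hypothetical names:

```python
# Minimal sketch of consistent, sparing color use (G3.3.4, G3.4.1):
# one authoritative color per dog, reused by every widget.
PALETTE = ["red", "green", "blue"]  # few, primary colors

def assign_colors(dog_ids):
    """Map each dog to a unique palette color; fail rather than reuse one."""
    if len(dog_ids) > len(PALETTE):
        raise ValueError("more dogs than distinct colors; extend the palette")
    return dict(zip(dog_ids, PALETTE))

colors = assign_colors(["Rex", "Hana"])
# Tab, camera border, and map path all read the same entry, instead of,
# e.g., hard-coding yellow paths with red bark markers elsewhere.
tab_color = colors["Rex"]
path_color = colors["Rex"]
```

The point of routing every widget through one mapping is that an overload like the yellow-path/red-bark scheme described above cannot creep in piecemeal.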
In terms of general interaction with, and through, the display, the interface was
deemed to fit the 1-min rule (G4.1). It was immediately comprehensible because of
its layout and immediate access to expected functions (like taking a picture) with
one-click icons versus multi-click drop-down menus (G4.2). The interface supports
transitional users, as a beginner can understand what is going on while not yet exploiting
features such as weather or taking pictures (G4.3). There is no opportunity to tailor
the display to personal preferences (G4.4), because the workflow is straightforward
and does not lend itself to customization.
Figure 11.8 shows the construction robot system. It consists of a mobile base that
carries two arms. The two arms can swivel, and their rotation is unlimited. In addition,
the construction robot has sensors whose outputs can be fused into a top-down synthetic image,
and it can deploy a tethered unmanned aerial vehicle. The goal was to design an End-User
interface. The construction robot provides a contrasting case study to that of
the Cyber K9, as well as a case study of designing for new technologies and adapting the
interface to a dynamically changing set of tasks.
Like the Cyber K9, the UGV was at a relatively high TRL. There were at least
three attributes that should make designing user interfaces easy. There was only one
End-User role, that of the robot operator. The End-User operator role was well-
established and situated in an existing work domain. The robot design team had
unrestricted access to a representative end-user who was embedded with the team.
In contrast, there were at least three attributes that made designing the interface
hard. One challenge is that the interface must engage two robots, the primary UGV and
a secondary UAV tethered to the UGV. A second challenge is the complexity of the
robot system: the UGV has two dexterous arms and multiple sensor views, presenting far
more data than the Cyber K9 and also requiring the operator to conduct work mediated
by the interface. The third, and perhaps most noteworthy attribute for practical
HRI interface design, is that the interface used two new interface technologies: novel
tactical interfaces for controlling the robot and a multi-display environment (MDE),
where multiple monitors were used.
Fig. 11.8 View of construction robot with arms and tethered unmanned aerial vehicle
Fig. 11.9 View of the multi-display environment (MDE) for the construction robot
Fig. 11.10 Views of the curved screen display for the construction robot and the tactical interface
As described below, both interfaces highlight guidelines from Sect. 11.4. This section
is organized by the four categories.
In terms of roles, the interface is intended solely for a single operator who is
responsible for the UGV and its two arms (G1.1).
In terms of layout appropriateness, a multi-display environment (MDE) has special
rules; see [5, 6, 10–12, 16, 18, 28, 29, 31, 35, 37]. But the MDE does illustrate
G2.7, providing the operator the ability to change display content and location for
the tasks within the larger mission. The curved screen with multiple windows shows
the difficulty of designing for a previously unknown footprint (G2.1), but does allow
the primary window to be the largest and most captivating (G2.2).
In terms of the Four C’s, the interface shows attention to content and coordination.
The interface provided no comparisons and did not make use of color. In both the
MDE and curved screen versions, each monitor or window had a single information
purpose (G3.1.1). The curved screen with multiple windows allowed some windows
to overlap the primary window, perhaps to signify semantic correspondence (G2.4),
but at the risk of violating G2.3.
In terms of general interaction with, and through, the display, the 1-min rule (G4.1)
was reasonably satisfied, though not all functionality was, or could be, intuitive
without some sort of training. Due to the training and skill set required to operate the
robot (and the consequences of poor control), there is unlikely to be a need to design
for a transitional user (G4.3). The curved display was particularly sensitive to G4.4,
allowing the operator to choose the content, size, and placement of windows, as well
as create layouts for different tasks that could be switched with one click.
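The one-click task layouts described for the curved display can be modeled as named presets that map window identifiers to screen geometry, so that switching tasks restores every window at once. The window names and geometries below are illustrative assumptions, not the construction robot's actual software:

```python
# Illustrative sketch of per-task layout presets (G2.7, G4.4).
# Window names and rectangles are hypothetical.
Rect = tuple  # (x, y, width, height) in screen coordinates

LAYOUTS = {
    "digging": {
        "main_camera": (0, 0, 1280, 720),   # primary view kept largest (G2.2)
        "arm_camera": (1280, 0, 640, 360),
        "overhead_map": (1280, 360, 640, 360),
    },
    "uav_survey": {
        "uav_video": (0, 0, 1280, 720),
        "main_camera": (1280, 0, 640, 360),
        "overhead_map": (1280, 360, 640, 360),
    },
}

def apply_layout(name, place_window):
    """Switch every window to the named task layout in one call."""
    for window, rect in LAYOUTS[name].items():
        place_window(window, rect)

# One click = one apply_layout call; here place_window just records geometry.
placed = {}
apply_layout("uav_survey", lambda w, r: placed.update({w: r}))
```

Storing whole layouts, rather than letting the operator drag windows mid-task, is one plausible way to honor G4.4 without sacrificing the 1-min rule.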
11.9 Summary
In summary, designers may have to prepare three to four different interfaces over
the technological maturation process of a robot. The designer will typically create
a diagnostic interface implicitly targeting their role in programming and debugging.
That interface will generally be too complex and cluttered with details that do not help
the End-User quickly, and accurately, perform tasks. The interface may also introduce
extraneous cognitive work load by requiring the End-User to recall relationships,
abbreviations, etc., that could be replaced with icons, colors, and coordination that
allow the user to immediately recognize the current state of the robot and world; these
abstractions would be the heart of a good End-User execution interface. However,
many robots have two or more humans who are operating or consuming information
from the robot, so there may be multiple versions of the End-User interface. One role
that is often ignored but is critical to funding and public acceptance of robotics is the
explicative interface for the Public. These interfaces are challenging because robot
interfaces have at least three major differences from human-computer interfaces. To
help proactively design any of the three categories of interfaces, this chapter offers
thirty-two guidelines and gives two case studies of how they have been used to
successfully avoid common mistakes in layout, content, visualization of comparison,
and use of color.
Acknowledgements This work was supported by the Impulsing Paradigm Change through Disruptive
Technologies (ImPACT) Tough Robotics Challenge program of the Japan Science and Technology
Agency (JST).
References
1. Burke, J., Murphy, R.: Human-robot interaction in USAR technical search: two heads are
better than one. In: 13th IEEE International Workshop on Robot and Human Interactive
Communication (RO-MAN), pp. 307–312
2. Burns, C.M., Hajdukiewicz, J.: Ecological Interface Design. CRC Press, Boca Raton (2004)
3. Card, S., Moran, T.P., Newell, A.: The Psychology of Human Computer Interaction. Lawrence
Erlbaum, Hillsdale (1983)
4. Casper, J.: Human-robot interactions during the robot-assisted urban search and rescue response
at the World Trade Center. Thesis (2002)
5. Chung, H., Chu, S.L., North, C.: A comparison of two display models for collaborative sensemaking.
In: Proceedings of the 2nd ACM International Symposium on Pervasive Displays,
pp. 37–42. ACM
6. Cockburn, A., Karlson, A., Bederson, B.B.: A review of overview+detail, zooming, and
focus+context interfaces. ACM Comput. Surv. 41(1), 1–31 (2009)
7. Cooper, J.L., Goodrich, M.A.: Towards combining UAV and sensor operator roles in UAV-enabled
visual search, pp. 351–358 (2008)
8. Demir, I., Jarema, M., Westermann, D.: Visualizing the central tendency of ensembles of shapes,
pp. 1–8 (2016)
9. Grudin, J.: Partitioning digital worlds: focal and peripheral awareness in multiple monitor use.
In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems,
pp. 458–465. ACM
10. Hutchings, D.R., Stasko, J., Czerwinski, M.: Distributed display environments. Interactions
12(6), 50–53 (2005)
11. Hutchings, D.R., Stasko, J.: Consistency, multiple monitors, and multiple windows. In: Proceedings
of the SIGCHI Conference on Human Factors in Computing Systems, pp. 211–214. ACM
12. Ishak, E.W., Feiner, S.: Content-aware layout. In: CHI ’07 Extended Abstracts on Human
Factors in Computing Systems, pp. 2459–2464. ACM
13. Klein, G., Woods, D.D., Bradshaw, J.M., Hoffman, R.R., Feltovich, P.J.: Ten challenges for
making automation a “team player” in joint human-agent activity. IEEE Intell. Syst. 19(6),
91–95 (2004)
14. Klingenberg, C.: Visualizations in geometric morphometrics: How to read and how to make
graphs showing shape changes. Hystrix 24, 1–10 (2013)
15. Lee, W., Ryu, H., Yang, G., Kim, H., Park, Y., Bang, S.: Design guidelines for map-based
human-robot interfaces: a colocated workspace perspective. Int. J. Ind. Ergon. 37(7), 589–604
(2007)
16. Lischke, L., Mayer, S., Wolf, K., Henze, N., Reiterer, H., Schmidt, A.: Screen arrangements and
interaction areas for large display work places. In: Proceedings of the 5th ACM International
Symposium on Pervasive Displays, pp. 228–234
528 R. R. Murphy and S. Tadokoro
17. Malmstrom, C., Zhang, Y., Pasquier, P., Schiphorst, T., Bartram, L.: Mocomp: a tool for com-
parative visualization between takes of motion capture data, pp. 1–8 (2016)
18. Marrinan, T., Leigh, J., Renambot, L., Forbes, A., Jones, S., Johnson, A.E.: Mixed presence
collaboration using scalable visualizations in heterogeneous display spaces. In: Proceedings of
the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing,
pp. 2236–2245. ACM
19. McKenney, M., Viswanadham, S.C., Littman, E.: The CMR model of moving regions, pp.
62–71 (2014)
20. Miller, G.: The magical number seven, plus or minus two: some limits on our capacity for
processing information. Psychol. Rev. 63(2), 81–97 (1956)
21. Murphy, R., Burke, J.: From remote tool to shared roles. IEEE Robot. Autom. Mag. (special
issue on New Vistas and Challenges for Teleoperation) 15(4), 39–49 (2008)
22. Murphy, R.R.: Disaster Robotics. MIT Press, Cambridge (2014)
23. Murphy, R.R., Burke, J.L.: The safe human-robot ratio. In: Human-Robot Interactions in Future
Military Operations. Ashgate Publishing Company, Brookfield, VT (2009)
24. Murphy, R.R., Pratt, K.S., Burke, J.: Crew roles and operational protocols for rotary-wing
micro-UAVs in close urban environments. In: Proceedings of the 3rd ACM/IEEE Human-
Robot Interaction, pp. 73–80 (2008)
25. Nielsen, J.: Enhancing the explanatory power of usability heuristics, pp. 152–158 (1994)
26. Nielsen, J.: Powers of 10: time scales in user experience (2009)
27. Nielsen, C.W., Goodrich, M.A., Ricks, R.W.: Ecological interfaces for improving mobile robot
teleoperation. IEEE Trans. Robot. 23(5), 927–941 (2007)
28. Pattison, T., Phillips, M.: View coordination architecture for information visualisation. In:
Proceedings of the 2001 Asia-Pacific Symposium on Information Visualisation, vol. 9,
pp. 165–169. Australian Computer Society
29. Peer, S., Sharma, D.K., Ravindranath, K., Naidu, M.: Layout design of user interface compo-
nents with multiple objectives. Yugosl. J. Oper. Res. 14(2), 171–192 (2004)
30. Peschel, J.M., Murphy, R.R.: On the human-machine interaction of unmanned aerial system
mission specialists. IEEE Trans. Hum.-Mach. Syst. 43(1), 53–62 (2013)
31. Plumlee, M., Ware, C.: An evaluation of methods for linking 3D views. In: Proceedings of the
2003 Symposium on Interactive 3D Graphics, pp. 193–201. ACM
32. Sears, A.L.: Layout appropriateness: guiding user interface design with simple task descriptions.
Ph.D. thesis, University of Maryland (1993)
33. Shneiderman, B.: Designing the User Interface: Strategies for Effective Human-Computer
Interaction. Addison-Wesley (1998)
34. Tevs, A., Huang, Q., Wand, M., Seidel, H.-P., Guibas, L.: Relating shapes via geometric sym-
metries and regularities. ACM Trans. Graph. 33(4), 1–12 (2014)
35. Truemper, J.M., Sheng, H., Hilgers, M.G., Hall, R.H., Kalliny, M., Tandon, B.: Usability in
multiple monitor displays. SIGMIS Database 39(4), 74–89 (2008)
36. Vicente, K.J.: Cognitive Work Analysis: Toward Safe, Productive, and Healthy Computer-
Based Work. LEA Inc., Mahwah (1999)
37. Wang Baldonado, M.Q., Woodruff, A., Kuchinsky, A.: Guidelines for using multiple views
in information visualization. In: Proceedings of the Working Conference on Advanced Visual
Interfaces, pp. 110–119. ACM
38. Wickens, C., Dixon, S., Chang, D.: Multiple resources and mental workload. Hum. Factors
50(3), 449–454 (2008)
Index
Graphical User Interface (GUI), 154, 155
Graphics engine, 455
Grasping, 292, 377
Great Eastern Japan Earthquake, 4
Great East Japan Earthquake, 485
Gripper, 291

H
H2SB, 433, 435, 437
Half-cylindrical screen, 207
Hand, 123
Hand camera, 206
Handler, 147, 149, 154, 176
Hand-mode, 441
Hanging a ladder, 130
Haptic device, 208, 235
Haptic feedback, 239
HARK, 103
Heart beat interval, 168, 174
Heart Rate Variability (HRV), 168, 175
Heavy task, 377
Helical rolling motion, 274
Helical wave propagation motion, 276
Helipad, 212
High-frame-rate, 61
High-frequency vibration, 66, 239
High-power-type snake robot, 279
High-speed camera, 185
High-step, 288
Human interface, 318
Hydraulic actuators, 223, 409
Hydraulic control system, 443
Hydraulic excavator, 197
Hydraulic hybrid, 432
Hydraulic Hybrid Servo Booster, 433
Hydraulic manipulator, 439
Hydraulic pressure sensors, 224
Hydraulic tough hand, 440
Hydraulic WAREC, 409

I
Image interpolation for camera malfunction, 253
ImPACT-ASC, 27
ImPACT-TRC, 147
Impulsive force, 231
Induced velocity, 120
Industrial disruptive innovation, 11
Industrial waste regulation, 486
Inertial Measurement Unit (IMU), 147, 148, 152, 177, 178, 184, 186, 338, 341
Information gathering and analysis, 6
Information gathering processes, 269
Information visualization, 318
Inspection, 504
Inspection and maintenance, 268
Interior pipe wall, 318, 321
International safety standards, 488, 491
Inverse dynamics, 226
Inverse kinematics, 341
ISO12100, 488, 489

J
Jack-up Robot, 403
Jacobian matrix, 225
Japan Rescue Dog Association (JRDA), 15, 145, 148, 150, 155
JIS standards, 334, 339

K
Kalman filter, 104

L
Ladder, 273
Ladder climbing, 342, 346
Laser beam, 187
Laser Range Sensor Array (LRSA), 350
Legged robot, 5, 9, 329, 333
Limited communication condition, 369
Localization and mapping, 308
LWIR, 249

M
Machine learning, 148, 168, 172–174, 176, 177, 182
Machine-terrain interaction, 465
Map, 161, 165, 167
Master-slave, 208
Master-slave system, 384
McKibben artificial muscle, 410
Mechanism, 124, 291
Membrane, 293
Metering, 433
Metric, 494
Microphone array, 100
MIL standards, 334, 339
Model task, 222
Multiple rotor drone, 111, 112, 117
Multiple signal classification, 99
Multi-resolution map, 362
Multi-Finger Hand, 470
Multi-label classification problem, 178
Multi-monitor system, 205
Root Mean Square of the Successive Difference (RMSSD), 168, 169, 172
Rotating fingers, 441
Rotational speed control, 111
Runge–Kutta–Gill method, 463

S
Sampling rate, 305
Sandbag, 381
Scale-gain adjustment, 387
SDNN, 168, 169, 172
Search And Rescue (SAR) dog, 145–147, 149, 150, 155, 176, 185, 189
Self-calibration, 56, 58
Semiautonomous stair climbing, 289
Separation principle, 488, 493
Serial link manipulator, 225
Serpentine robot, 5, 9
Servo valves, 443
SHERPA, 21
Simple shape, 272
SLAM, 349
Slider, 436
Smooth-type snake robot, 276
Snake-like articulated mobile robot, 285
Snake-like robot, 294
Social disruptive innovation, 11
Social implementation, 482, 484
Soft, 295
Soil mechanics test bench, 467
Sound source localization, 375
Sound source mapping, 100
Spatial resolution, 302
Speech enhancement, 44, 45
Stabilized head camera image, 318, 320
Standardization, 494
Standard Test Method (STM), 494
State-space model, 311
Steep staircase, 290
Stop principle, 488
Structure from motion, 58
Subjective evaluation, 244
Survey, 505
Switchboard, 299
Sympathetic nerve system, 168, 169, 171

T
T2 Snake-3, 285
Tactile feedback, 238
Tactile sensing, 65
Tactile sensor, 300
Tangential force, 306
Target definition, 161, 165
Technical disruptive innovation, 10
Technology catalogue, 13
Technology cycle, 11
Technology Readiness Level (TRL), 482
Tele-existence, 384
Telemanipulation, 383
Teleoperation, 369
Telexistence, 201
Terrain following, 288
Terramechanics, 466, 469
Tether powered MUAV, 212
Tether tension, 216
Thin Serpentine Robot Platform, 26
Third-person view, 369
Three-dimensional steering, 288
Three expected disaster robot functions, 4
Three-way valve, 427, 428
Tilt link, 232
Time-of-flight (ToF), 308
Time-Stretched Pulse (TSP), 310
Tohoku University, 13
Torque controllability, 439
Torsion, 272
Tough, 4
Tough snake robot systems, 268
Trajectory estimation, 148, 185
Trajectory tracking, 290
Triage, 146
Tunnel Disaster Challenge, 21

U
Uncertainty, 495
Unmanned Aerial Vehicle (UAV), 155
Unmanned construction system, 198
Unrolled picture, 318, 321
Unsupervised learning, 158
User needs, 18

V
Valve, 287
Variable-inner-volume, 292
Variable pitch control, 112
VDSL communication system, 220
Vehicle-like body frame, 318
Vibration sensor, 67, 241
Vibrotactile display, 248
Vibrotactile feedback, 67, 239
Victim Detection, 29
Virtual Chassis, 318
Virtual Fixture (VF), 387
Virtual marionette system, 375
W
Water jet, 39

Z
Zero Velocity Point (ZVP), 186