Nan Tian1,2, Ajay Kumar Tanwani1, Jinfa Chen2, Mas Ma2, Robert Zhang2, Bill Huang2
Ken Goldberg1 and Somayeh Sojoudi1,3,4
IV. SELF-BALANCING ROBOT IGOR

Igor is a 14 degrees-of-freedom (DoF), dual-arm, dual-leg, dynamic self-balancing robot made by HEBI Robotics (shown in Fig. 3). Each DoF is built with a self-contained, series-elastic X-series servo module. These servos can be controlled with position, velocity, and torque commands simultaneously, and can provide accurate measurements of these three quantities to a central computer at high speed (>1 kHz) with minimal latency. The modules also act as Ethernet switches, and relay sensor information and commands to and from the native on-board Intel NUC computer on the robot.

Self-balancing is achieved by modeling the system as an inverted pendulum (see Fig. 3, bottom). To estimate the robot's center of mass (CoM), the CoMs of the two arms and two legs are first measured in real-time using forward kinematics via HEBI's API. The position of the total CoM is then estimated as the average of the CoMs of the four extremities plus the CoM of the control box, weighted by the mass distribution:

$x_{CoM} = \frac{\sum_i m_i x_i}{\sum_i m_i}, \quad i = \text{arms, legs, box}$   (1)

Igor also uses accelerometer measurements to estimate the direction of gravity ($G$) at all times. With the CoM of Igor, the center of the wheels ($o_w$), and the direction of gravity, we can calculate the length and direction of the inverted pendulum:

$L = x_{CoM} - o_w$   (2)

The lean angle ($\theta$), which is the angle between gravity and the inverted pendulum, can then be estimated in real-time:

$\theta = \cos^{-1}\left(\frac{L \cdot G}{|L||G|}\right)$   (3)

To keep the robot balanced, we assume that the lean angle ($\theta$) is small so that the system can be linearized:

$\theta \approx \sin(\theta)$   (4)

A torque ($T$) in the direction of falling is applied to the wheel with radius ($R$) and angular velocity ($\omega$) to counteract the effects of gravity on the robot's center of mass:

$T = R\omega = \dot{v}L$   (5)

where $v$ is the velocity of the robot's CoM. Furthermore, the derivative of the lean angle ($\dot{\theta}$) can be controlled by a proportional controller with coefficient ($K_v$), and is related to the velocity of the robot as follows:

$\dot{\theta} = K_v v$   (6)

Real-time measurements of both the robot's CoM and the direction of gravity are important, because the Igor controller needs to compensate for dynamic movements of the four extremities for robust self-balance control. We can also rotate the robot by applying different velocities to the two wheels. Inertial measurement unit (IMU) readings can then be used to control the robot's turning angle in real-time.

V. AN INTELLIGENT FOG ROBOTIC CONTROLLER

A. Edge Controllers

There are two edge controllers in our system. The first is the native Igor robot controller on the Intel NUC computer. It collects all sensor information from the 14 modular servos and controls them in real-time. It hosts a high-speed feedback control loop (200 Hz or above) to maintain robot posture and self-balancing. We refer to it as the low-level controller in Fig. 1.
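As a concrete illustration of the CoM and lean-angle estimation of Section IV (Eqs. 1-3), the following Python sketch computes the lean angle from hypothetical link masses and CoM positions. All numeric values below are placeholders, not values from HEBI's API, and we use the upward vertical in place of $G$ so that an upright pose yields a lean angle near zero:

```python
import numpy as np

# Hypothetical masses (kg) and CoM positions (m) for the four extremities and
# the control box -- placeholder values, not taken from HEBI's API.
masses = {"arm_l": 2.0, "arm_r": 2.0, "leg_l": 3.0, "leg_r": 3.0, "box": 5.0}
coms = {
    "arm_l": np.array([0.05, 0.20, 0.60]),
    "arm_r": np.array([0.05, -0.20, 0.60]),
    "leg_l": np.array([0.00, 0.10, 0.30]),
    "leg_r": np.array([0.00, -0.10, 0.30]),
    "box":   np.array([0.00, 0.00, 0.70]),
}

# Eq. (1): total CoM as the mass-weighted average of the five bodies.
x_com = sum(masses[k] * coms[k] for k in masses) / sum(masses.values())

# Eq. (2): inverted-pendulum vector from the wheel-axle center o_w to the CoM.
o_w = np.array([0.0, 0.0, 0.10])
L_vec = x_com - o_w

# Eq. (3): lean angle between the pendulum and the vertical. We use the upward
# vertical (opposite to measured gravity) so an upright pose gives theta ~ 0.
G = np.array([0.0, 0.0, 1.0])
cos_theta = (L_vec @ G) / (np.linalg.norm(L_vec) * np.linalg.norm(G))
theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
```

A proportional law in the spirit of Eq. (6) would then command the wheel velocity from theta to drive the lean back toward zero.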
Fig. 3. Self-Balancing Robot Igor: (Top) 14-DoF, dual-arm, dual-leg robot built with series-elastic servo modules. (Courtesy of HEBI Robotics) (Bottom) Free-body diagram of the inverted-pendulum balancing control.

The other edge controller is the robot command unit (RCU). It is a smart Android phone with a private LTE connection and a 2D camera (Fig. 1). RCU serves as the gateway between the high-level cloud robotic platform and the low-level self-balancing controller. It uses the private LTE connection to stream live videos to the cloud. It receives and forwards high-level intelligent controls from the cloud to the low-level controller with minimal delay. RCU works both indoors and outdoors with good LTE reception.

B. Cloud Controller

A high-level intelligent robot controller is placed in the cloud to work with the edge controller RCU (see Fig. 1). It operates at a lower speed (3-5 Hz), yet it commands the robot based on HARI's Artificial Intelligence (AI) and Human Intelligence (HI) services, which are critical for robots operating in unstructured environments. Depending on the situation, it can either extract commands based on the object recognition server or forward commands sent from a cloud teleoperator. These high-level commands are sent to RCU. They are then forwarded, transformed, and executed in the form of dynamic commands on the low-level controller.

C. Hybrid Control with "Heartbeat"

When controlling Igor, commands sent from the high-level cloud controller act as perturbations to a time-invariant, stable system maintained by the low-level self-balancing controller at the edge. This is a form of hybrid control in which discrete signals are sent from the cloud to control a dynamical system at the edge.

To minimize network delays, we choose UDP, an asynchronous network protocol, to implement the Cloud-Edge communication. However, with UDP, packets can be lost during transmission. They can also arrive at the destination out of order. Both problems can cause unstable, unpredictable dynamic hybrid control at the Edge, which can lead to bad user experiences and human safety issues during teleoperation.

We use a "heartbeat" design to solve these problems (shown in Fig. 4). The "heartbeat" is a switch signal that is turned on when the first control signal arrives at the Edge. It remains on for a period of time ($t$) and only turns off if no packet is received for the selected command during this time. We can view the "heartbeat" as performing a "convolution" with a moving window on the signal received from RCU. Finally, we add a ramping function at the beginning and the end of the "heartbeat" signal to ensure smooth and stable start and stop transitions.

VI. DYNAMIC VISUAL SERVOING

To assist teleoperation with automation, we use Fog Robotic IBVS to control Igor for automatic box pickups. We choose IBVS because it eliminates time-consuming camera registration, which is hard to execute on a dynamic robotic system in an unstructured, human-rich environment.

The goal of the IBVS is to navigate the robot to an optimal box-pickup location where the apriltag lies exactly within the green target box (Fig. 4, right). To do that, we need to minimize the relative error $e(t)$ between the measured target position and the desired target position:

$e(t) = s(m(t), a) - s^*$   (7)

where $m(t)$ is a set of image measurements and $a$ is a set of parameters, such as camera intrinsics, that represent additional knowledge about the system. $s$ is the measured values of image features/object locations, such as pixel coordinates in the picture frame, and $s^*$ is the desired values of image features/object locations.

The change of the feature error $\dot{e}$ and the camera velocity $v_c$ are related by the interaction matrix $L$:

$\dot{e} = L v_c$   (8)

For IBVS, which is done in 2D image space, 3D points $X = (X, Y, Z)$ are projected onto the 2D image with coordinates $x = (x, y)$:

$x = X/Z$   (9)

$y = Y/Z$   (10)

which yields the interaction matrix for 2D image-based servoing:

$L = \begin{bmatrix} -1/Z & 0 & x/Z & xy & -(1+x^2) & y \\ 0 & -1/Z & y/Z & 1+y^2 & -xy & -x \end{bmatrix}$

With the interaction matrix, the camera velocity can be estimated by:

$v_c = -L^+ e = -L^+ (s - s^*)$   (11)

where $L^+$ is the Moore-Penrose pseudo-inverse of $L$:

$L^+ = (L^T L)^{-1} L^T$   (12)
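The IBVS velocity computation of Eqs. (8)-(12) can be sketched in Python as follows. The function names are ours, and stacking one 2x6 block per tracked point with a single common depth Z (estimated from the apriltag size) is an illustrative simplification:

```python
import numpy as np

def interaction_matrix(x, y, Z):
    """2x6 interaction matrix for one image point (x, y) at depth Z."""
    return np.array([
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x ** 2), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y ** 2, -x * y, -x],
    ])

def camera_velocity(s, s_star, Z):
    """Eq. (11): v_c = -L^+ (s - s*), stacking one 2x6 block per feature."""
    L = np.vstack([interaction_matrix(x, y, Z) for (x, y) in s])
    e = (np.asarray(s) - np.asarray(s_star)).ravel()   # Eq. (7)
    # Eq. (12): Moore-Penrose pseudo-inverse of L.
    return -np.linalg.pinv(L) @ e
```

When the measured features coincide with the desired ones ($s = s^*$), the error vanishes and the commanded camera velocity is zero. A positive gain is often multiplied into Eq. (11) to tune convergence speed; we omit it here.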
Fig. 4. (Left) "Heartbeat" Asynchronous Protocol: (Green) network delays from cloud teleoperator to robot Edge controller; (Purple) sliding windows that turn ON the "heartbeat"; (Red) sliding window that turns OFF the "heartbeat". (Right) Image Based Visual Servoing (IBVS): Igor automatically picks up a box using Fog Robotic IBVS. Videos: (1) Pickup from a table: https://youtu.be/b0mr5GHHjBg (2) Pickup from a person: https://youtu.be/K9R4y2w1uPw
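The "heartbeat" switch of Section V-C, including the start/stop ramps, can be sketched as a small watchdog. The class name, the linear ramp shape, and the injected timestamps are illustrative choices, not the exact Edge implementation:

```python
import time

class Heartbeat:
    """Sketch of the 'heartbeat' switch: ON while command packets keep
    arriving within a moving window of `window_s` seconds, OFF otherwise.
    `gain()` scales command magnitude near start/stop for smooth transitions."""

    def __init__(self, window_s=0.25, ramp_s=0.05):
        self.window_s = window_s      # heartbeat window t (250 ms in the paper)
        self.ramp_s = ramp_s          # ramp duration (illustrative value)
        self.last_packet = None
        self.started = None

    def packet_received(self, now=None):
        now = time.monotonic() if now is None else now
        if self.last_packet is None or now - self.last_packet > self.window_s:
            self.started = now        # (re)start of a command burst
        self.last_packet = now

    def gain(self, now=None):
        """0.0 when OFF; ramps 0->1 after start and 1->0 before timeout."""
        now = time.monotonic() if now is None else now
        if self.last_packet is None or now - self.last_packet > self.window_s:
            return 0.0
        up = min(1.0, (now - self.started) / self.ramp_s)
        down = min(1.0, (self.window_s - (now - self.last_packet)) / self.ramp_s)
        return min(up, down)
```

Commands arriving from RCU would call `packet_received()`, and the low-level controller would multiply incoming velocity commands by `gain()`, so output ramps up after the first packet and decays to zero within the window after the last one.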
The final control law is set as a robot velocity effort $v_s$ opposite to the camera velocity $v_c$, because the target moves in the opposite direction of the camera in the image frame:

$v_s = -v_c$   (13)

Notice that the interaction matrix depends only on $x$ and $y$, the 2D pixel coordinates of the target, and $Z$, the depth of the target. In our system, $Z$ is measured from the size of the apriltag. Therefore, the IBVS measurement is independent of the exact 3D position of the target. This is an attractive feature for our system design, because exact 3D camera registration is not required for IBVS to complete the box pickups.

A. IBVS Implementations for Automatic Box Pickup

The automatic IBVS controller performs a box pickup in three phases. In phase one, the controller moves the robot to a position where the apriltag has the same size as the green box (Fig. 4, right). The robot also needs to position the apriltag on the center purple line of the video frame after phase one, but not exactly at the center due to height differences. In phase two, the robot adjusts its own height by changing the joint angles of the two "knee" joints so that the apriltag lies at the center of the video frame where the green box is. After the robot reaches the optimal picking position, when the apriltag is at the center of the video, phase three begins. The robot controller commits to performing a box pickup with a pre-defined, hard-coded, dual-arm grasping motion.

B. Fog Robotic IBVS

Although we use simple apriltags for object recognition, we aim to anticipate the design of a deep-learning-based Fog Robotic visual system for robotic pickups. Therefore, to emulate the latency effects under such a system, we deploy apriltag recognition in the cloud and use the "heartbeat" protocol to stream the apriltag's geometric information (size and location) to the low-level controller via RCU. Together, we build a robust, closed-loop Fog Robotic IBVS controller for box pickups.

VII. EXPERIMENTS AND RESULTS

With the "heartbeat" design, we can teleoperate the self-balancing robot reliably through the Cloud. To teleoperate the robot for box pickups, we hardcode a box-pickup motion with the two robot arms. We first attempt to pick up a box with a joystick via direct cloud teleoperation. However, even with a reliable teleoperation module and a pre-programmed pickup motion, we find it extremely difficult to finish the task using cloud teleoperation. We suspect that natural, immersive 3D visual perception is critical for a human to pick up objects efficiently, but a human teleoperator loses these perceptions using our current cloud teleoperation interface. In other words, our cloud teleoperation interface is not intuitive for efficient object pickups.

TABLE I
TELEOPERATION VS. AUTOMATIC BOX PICKUPS

                Average Duration (s)   Success Rate
Local Teleop            43                 9/10
Cloud Teleop           340                 4/10
Auto Pickups            46                10/10

To quantify the observation, we perform two different teleoperation experiments with 10 trials each: (1) control Igor locally, so that the operator is close by and can see the robot and the box; (2) control Igor remotely from the cloud to pick up the box through the teleoperation interface (Fig. 4, right). In both cases, the box is positioned at the center of the table. The robot is 2 meters from the object and faces the front of the box (see setups in video https://youtu.be/b0mr5GHHjBg). We observe that the local teleoperator can perform box pickups much faster and with a higher success rate than the cloud teleoperator (see Table I).

To assist the cloud teleoperator with improved intuition and efficiency, we implement the automatic IBVS module. We benchmark the same experiments using the automatic module with a human in the loop. In these experiments, we allow the
cloud teleoperator to first teleoperate the robot as fast as possible to a location where the apriltag is recognizable (up to 20 degrees from the surface normal of the apriltag) and is about 2 meters away. The teleoperator then switches on the Fog Robotic IBVS module, allowing the robot to pick up the box automatically. We observe that the speed of this approach is on par with local human teleoperation, but the reliability is even higher, at 100% (see Table I).

Fig. 5. Igor Box-Pickup Reliability Tests: (A) Top-down view: four starting positions with respect to the standing desk; (B) Front view: the position of the box is generated randomly from two uniformly distributed variables, horizontal coordinate X, in range (-1, 1) meters, and vertical coordinate Y, in range (0.8, 1.4) meters.

TABLE II
SUCCESSFUL AUTOMATIC PICKUPS OF A RANDOMLY POSITIONED BOX

Start Location   Point I   Point II   Point III   Point IV
Success Rate      8/10      9/10       9/10       10/10

We further conduct 40 reliability trials. In each trial, Igor starts from one of four starting locations, Points I-IV (Fig. 5, A), to pick up a box that is placed at a random position on a standing desk. We perform 10 trials for each robot starting location, and the box position is defined by two uniformly distributed random variables: a horizontal X coordinate in the range of -1 to 1 meters and a vertical Y coordinate between 0.8 and 1.4 meters (Fig. 5, B). These trials show the high reliability of the automatic IBVS box-pickup system (Table II).

Finally, we use the IBVS module to assist a cloud teleoperator in picking up a box efficiently from a box carrier. During this task, a moving human carrier holds a box with an apriltag. A cloud teleoperator drives Igor to a location where the apriltag can be recognized. As soon as the robot recognizes it, the teleoperator switches the robot into IBVS mode so that the robot can start the automatic box pickup. We observe that as long as the robot can track the apriltag, it will dynamically follow the human around; and if the human carrier stops and holds the box at a static, reachable position, the robot will eventually pick up the box from the carrier.

VIII. DISCUSSION AND FUTURE WORKS

In this work, we combine intelligent Cloud platforms and fast Edge devices to build a Cloud-Edge hybrid Fog Robotic IBVS system that can perform dynamic, human-compliant, automatic box pickups. A "heartbeat" protocol with asynchronous communication is implemented under this framework to mitigate network latencies and variabilities. However, the current "heartbeat" protocol is not perfect. The "heartbeat" actually increases delay at the end of the action, which causes a further delayed reaction after the last command signal. This is a significant safety concern, because even if the "heartbeat" time window is short, i.e. 250 ms in our case, the robot will continue to move until after the 250 ms window. To compensate, we implement a sharper ramp function at the end of the action, but 250 ms is the hard limit for this kind of delay with the current "heartbeat" system (red arrow in Fig. 4).

In the future, we can further reduce such delay by modeling the variability of the time intervals between packet arrivals due to packet loss. This way, the Edge controller can predict when the last packet will arrive, so that it can finish the action before the arrival of the last command.

Like other service robots, Igor needs to interact and cooperate with human beings. We demonstrate the advantages of visual servoing: (1) it requires no calibration before each robotic task; (2) it can handle dynamic human-robot interaction, such as following a human to pick up a box from that person. One failure case occurs when the human carrier tricks the robot: the pickup fails if the human carrier moves the box after the robot commits to the final phase of box picking, which is hard-coded. Instead, we could automate the pickup motion based on visual servoing as well; we leave this as future work.

For Fog Robotic IBVS, we emulate a cloud-based deep-learning visual perception system by deploying apriltag recognition in the cloud. As part of future work, we plan to deploy deep-learning recognition pipelines such as Mask R-CNN [40] together with intelligent grasping systems such as Dex-Net [27] [7] [19], so that it can guide both dynamic robots such as Igor and static robots such as YuMi [18] and HSR [41] to perform generalized, human-compliant object pickups and manipulations.

IX. ACKNOWLEDGMENTS

We thank Ron Berenstein from the AUTOLAB for visual servoing related discussions; and Matthew Tesch, David Rollinson, Curtis Layton, and Prof. Howie Choset from HEBI Robotics for continuous support, training, and discussions on the self-balancing robot. We thank Arvin Zhang, Havelet Zhang, and Mel Zhao from Cloudminds for technical support. Special thanks to Prof. Joseph Gonzalez for discussions on hybrid synchronous and asynchronous systems. Funding from the Office of Naval Research, NSF EPCN, and Cloudminds Inc. is acknowledged. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the sponsors.
REFERENCES

[1] I. F. of Robotics, Introduction into Service Robots, 2016 (accessed September 15, 2018). [Online]. Available: https://ifr.org/img/office/Service Robots 2016 Chapter 1 2.pdf
[2] ——, Executive Summary World Robotics 2017 Service Robots, 2017 (accessed September 15, 2018). [Online]. Available: https://ifr.org/downloads/press/Executive Summary WR Service Robots 2017.pdf
[3] R. Girshick, "Fast r-cnn," in Proceedings of the IEEE international conference on computer vision, 2015, pp. 1440–1448.
[4] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," in Advances in neural information processing systems, 2012, pp. 1097–1105.
[5] W. Abdulla, "Mask r-cnn for object detection and instance segmentation on keras and tensorflow," https://github.com/matterport/Mask_RCNN, 2017.
[6] Z. Cao, T. Simon, S.-E. Wei, and Y. Sheikh, "Realtime multi-person 2d pose estimation using part affinity fields," in CVPR, 2017.
[7] J. Mahler, J. Liang, S. Niyaz, M. Laskey, R. Doan, X. Liu, J. A. Ojea, and K. Goldberg, "Dex-net 2.0: Deep learning to plan robust grasps with synthetic point clouds and analytic grasp metrics," arXiv preprint arXiv:1703.09312, 2017.
[8] S. Levine and P. Abbeel, "Learning neural network policies with guided policy search under unknown dynamics," in Advances in Neural Information Processing Systems, 2014, pp. 1071–1079.
[9] S. Levine, P. Pastor, A. Krizhevsky, and D. Quillen, "Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection," arXiv preprint arXiv:1603.02199, 2016.
[10] C. Finn and S. Levine, "Deep visual foresight for planning robot motion," in Robotics and Automation (ICRA), 2017 IEEE International Conference on. IEEE, 2017, pp. 2786–2793.
[11] A. X. Lee, S. Levine, and P. Abbeel, "Learning visual servoing with deep features and fitted q-iteration," arXiv preprint arXiv:1703.11000, 2017.
[12] N. Tian, B. Kuo, X. Ren, M. Yu, R. Zhang, B. Huang, K. Goldberg, and S. Sojoudi, "A cloud-based robust semaphore mirroring system for social robots," learning, vol. 12, p. 14.
[13] B. Kehoe, S. Patil, P. Abbeel, and K. Goldberg, "A survey of research on cloud robotics and automation," IEEE Transactions on Automation Science and Engineering, vol. 12, no. 2, pp. 398–409, 2015.
[14] J. J. Kuffner et al., "Cloud-enabled robots," in IEEE-RAS international conference on humanoid robotics, Nashville, TN, 2010.
[15] G. Mohanarajah, D. Hunziker, R. D'Andrea, and M. Waibel, "Rapyuta: A cloud robotics platform," IEEE Transactions on Automation Science and Engineering, vol. 12, no. 2, pp. 481–493, 2015.
[16] A. Vick, V. Vonásek, R. Pěnička, and J. Krüger, "Robot control as a service—towards cloud-based motion planning and control for industrial robots," in Robot Motion and Control (RoMoCo), 2015 10th International Workshop on. IEEE, 2015, pp. 33–39.
[17] J. Ichnowski, J. Prins, and R. Alterovitz, "Cloud based motion plan computation for power constrained robots," in 2016 Workshop on the Algorithmic Foundations of Robotics. WAFR, 2016.
[18] N. Tian, M. Matl, J. Mahler, Y. X. Zhou, S. Staszak, C. Correa, S. Zheng, Q. Li, R. Zhang, and K. Goldberg, "A cloud robot system using the dexterity network and berkeley robotics and automation as a service (brass)," in Robotics and Automation (ICRA), 2017 IEEE International Conference on. IEEE, 2017, pp. 1615–1622.
[19] P. Li, B. DeRose, J. Mahler, J. A. Ojea, A. K. Tanwani, and K. Goldberg, "Dex-net as a service (dnaas): A cloud-based robust robot grasp planning system."
[20] A. Manzi, L. Fiorini, R. Esposito, M. Bonaccorsi, I. Mannari, P. Dario, and F. Cavallo, "Design of a cloud robotic system to support senior citizens: the kubo experience," Autonomous Robots, pp. 1–11, 2016.
[21] A. K. Tanwani, N. Mor, J. Kubiatowicz, J. Gonzalez, and K. Goldberg, "A fog robotics approach to deep robot learning: Application to object recognition and grasp planning in surface decluttering," in IEEE International Conference on Robotics and Automation (ICRA), 2019.
[22] F. Bonomi, R. Milito, J. Zhu, and S. Addepalli, "Fog computing and its role in the internet of things," in Proceedings of the first edition of the MCC workshop on Mobile cloud computing. ACM, 2012, pp. 13–16.
[23] C. Mouradian, D. Naboulsi, S. Yangui, R. H. Glitho, M. J. Morrow, and P. A. Polakos, "A comprehensive survey on fog computing: State-of-the-art and research challenges," IEEE Communications Surveys & Tutorials, vol. 20, no. 1, pp. 416–464, 2017.
[24] S. Gudi, S. Ojha, B. Johnston, J. Clark, and M.-A. Williams, "Fog robotics: An introduction," in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2017.
[25] A. Kattepur, H. K. Rath, and A. Simha, "A-priori estimation of computation times in fog networked robotics," in 2017 IEEE International Conference on Edge Computing (EDGE). IEEE, 2017, pp. 9–16.
[26] S. Chinchali, A. Sharma, J. Harrison, A. Elhafsi, D. Kang, E. Pergament, E. Cidon, S. Katti, and M. Pavone, "Network offloading policies for cloud robotics: a learning-based approach," arXiv preprint arXiv:1902.05703, 2019.
[27] J. Mahler, F. T. Pokorny, B. Hou, M. Roderick, M. Laskey, M. Aubry, K. Kohlhoff, T. Kroger, J. Kuffner, and K. Goldberg, "Dex-net 1.0: A cloud-based network of 3d objects for robust grasp planning using a multi-armed bandit model with correlated rewards," in Proc. IEEE Int. Conference on Robotics and Automation (ICRA), 2016.
[28] J. Tobin, R. Fong, A. Ray, J. Schneider, W. Zaremba, and P. Abbeel, "Domain randomization for transferring deep neural networks from simulation to the real world," in Intelligent Robots and Systems (IROS), 2017 IEEE/RSJ International Conference on. IEEE, 2017, pp. 23–30.
[29] Z. Cao, G. Hidalgo, T. Simon, S.-E. Wei, and Y. Sheikh, "OpenPose: realtime multi-person 2D pose estimation using Part Affinity Fields," arXiv preprint arXiv:1812.08008, 2018.
[30] W. Zhang, M. S. Branicky, and S. M. Phillips, "Stability of networked control systems," IEEE Control Systems, vol. 21, no. 1, pp. 84–99, 2001.
[31] H. Wu, L. Lou, C.-C. Chen, S. Hirche, and K. Kuhnlenz, "Cloud-based networked visual servo control," IEEE Transactions on Industrial Electronics, vol. 60, no. 2, pp. 554–566, 2013.
[32] H. Wang, Y. Tian, and N. Christov, "Event-triggered observer based control of networked visual servoing control systems," Journal of Control Engineering and Applied Informatics, vol. 16, no. 1, pp. 22–30, 2014.
[33] A. Birk, T. Doernbach, C. Müller, T. Luczynski, A. G. Chavez, D. Köhntopp, A. Kupcsik, S. Calinon, A. Tanwani, G. Antonelli et al., "Dexterous underwater manipulation from distant onshore locations," IEEE Robotics and Automation Magazine.
[34] I. Havoutis and S. Calinon, "Learning assistive teleoperation behaviors from demonstration," in Safety, Security, and Rescue Robotics (SSRR), 2016 IEEE International Symposium on. IEEE, 2016, pp. 258–263.
[35] P. J. Besl and N. D. McKay, "Method for registration of 3-d shapes," in Sensor Fusion IV: Control Paradigms and Data Structures, vol. 1611. International Society for Optics and Photonics, 1992, pp. 586–607.
[36] R. Tsai, "A versatile camera calibration technique for high-accuracy 3d machine vision metrology using off-the-shelf tv cameras and lenses," IEEE Journal on Robotics and Automation, vol. 3, no. 4, pp. 323–344, 1987.
[37] S. Hutchinson, G. D. Hager, and P. I. Corke, "A tutorial on visual servo control," IEEE transactions on robotics and automation, vol. 12, no. 5, pp. 651–670, 1996.
[38] F. Chaumette and S. Hutchinson, "Visual servo control. i. basic approaches," IEEE Robotics & Automation Magazine, vol. 13, no. 4, pp. 82–90, 2006.
[39] A. Shademan, A.-M. Farahmand, and M. Jägersand, "Robust jacobian estimation for uncalibrated visual servoing," in Robotics and Automation (ICRA), 2010 IEEE International Conference on. IEEE, 2010, pp. 5564–5569.
[40] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask r-cnn," in Proceedings of the IEEE international conference on computer vision, 2017, pp. 2961–2969.
[41] M. Laskey, J. Lee, R. Fox, A. Dragan, and K. Goldberg, "Dart: Noise injection for robust imitation learning," arXiv preprint arXiv:1703.09327, 2017.