Give Me a Hand – How Users Ask a Robotic Arm for Help with Gestures
Mahisorn Wongphati and Yusuke Kanai
Graduate School of Science and Technology
Keio University, Yokohama, Japan
Email: {mahisorn, kana}
Hirotaka Osawa and Michita Imai
Faculty of Science and Technology
Keio University, Yokohama, Japan
Email: {osawa, michita}
Abstract—A task that requires two hands to perform such
as soldering usually needs additional tools for holding (e.g. a
cable) or adding (e.g. solder) an object to a specific position.
A robotic manipulator, or robotic arm, is one solution to this
requirement. When gestures are chosen as the method for
controlling a robot, the characteristics of those gestures must be
known in order to design and develop a gesture recognition system.
To meet this need, we conducted an experiment to obtain a set of
user-defined gestures in a soldering task and to identify the
properties and patterns of these gestures for the future
development of our research. We collected 152 gestures from 19
participants by presenting the “effect” of a gesture (a robotic
arm movement) and then asking the participants to perform its
“cause” (a user-defined gesture). The analyzed data show that
hands are the most used body part even when they are occupied by
the task, that one-handed and two-handed gestures were used
interchangeably by the participants, that the majority of the
participants performed reversible gestures for reversible
movements, and that the participants expected better recognition
performance for gestures that were easier to plan. Our findings
can serve as a guideline for creating gesture sets and systems for
controlling robotic arms based on the natural behavior of users.
A great deal of research on the use of robotic manipulators, or
robotic arms, has been conducted by human-robot interaction (HRI)
research groups. Examples include rehabilitation [1], household
assistance [2], and industrial assistance [3]. Normally, a robotic
arm is controlled by a controller and software that have been
prepared for specific situations and environments. This common
procedure allows very sophisticated control over the robotic arm’s
behaviors and movements [4]. However, it limits the flexibility of
the robotic arm in unpredictable situations and environments.
One of the challenging problems in robotic arm usage is a robotic
hand helper that works closely with a human [3], [6], [7]. The
challenges include interaction timing [7] and communication method
[3]. Gesture has been selected by many researchers as a more
flexible and natural way to communicate with and control a robot
in unpredictable situations [3], [8], [9], [10], [11], [12].
According to related research in human-computer interaction (HCI)
[13], gestures designed by system designers are usually based on
ease of detection and distinguishability [14], [15]. This approach
might not be useful for predicting the gestures that would be
chosen by users.
Fig. 1. Give me a hand. An image from the Internet that shows the
need for a hand helper in a difficult situation.
In this paper, we focus on the specific situation in which both
hands are occupied by a task and a hand helper is needed. This
constraint can be observed in many situations, such as plastic or
wool model assembly (e.g. gluing two parts together), cooking
(e.g. needing some salt while stir-frying), and electronic
assembly (e.g. assembling a connector). An example situation in
which one more hand is really needed for adding solder to the
solder point is shown in Fig. 1.
We prepared the experiment based on our ongoing research on a
personal robotic workspace to find the characteristics of gestures
made when both hands are occupied by a task. The experiment
follows the methodology for studying gestures in surface computing
(e.g. Microsoft Surface) from [13]. We prepared a personal robotic
workspace with a simulated robotic arm as a hand helper. During
the experiment, the participants asked the robot to add solder to
a solder point using their own gestures.
Using a think-aloud protocol, interviews, a questionnaire, and
video analysis, we obtained qualitative data and results that
contribute the following to the HRI research community: (1)
characteristics of gestures for controlling a robotic arm, (2) a
set of carefully selected user-defined gestures, and (3) an
understanding of gestures from a user with occupied hands. This
information can be used as a guideline for designing and
developing a method for operating and controlling a robotic arm.
978-1-4673-1421-3/12/$31.00 © 2012 IEEE
Proceedings of the 2012 IEEE International Conference on
Cyber Technology in Automation, Control and Intelligent Systems
May 27-31, 2012, Bangkok, Thailand
A. Robotic Arm Movements
Two important questions about the robotic arm movements are (1)
how many movements are needed and (2) how long it takes each
participant to finish an experiment session.
To answer these questions, we conducted a pilot experiment with
lab members, as suggested in [16]. The experiment used six
movements (forward, backward, left, right, up, and down) and two
gripper motions (open and close) for controlling the end effector
of the robotic arm in 3D space.
Results from the pilot experiment suggested that the selected
movements (Fig. 2) trigger a wide range of gestures and require a
reasonable amount of time for each experiment session.
B. Participants
Nineteen volunteers participated in this study, 3 of whom were
female. The participants’ ages were collected with a
multiple-choice questionnaire and are shown in Table I. One
participant was left-handed. Two participants had no soldering
experience. Seven participants had some experience with robots,
and one of them had experience with a robotic arm. The
participants’ occupations included university student, company
employee, housewife, researcher, engineer, and retired person.
Table I. Ages of the participants
Age group      Under 18   18 - 30   30 - 45   46 - 55   Over 55
Participants   0          7         6         4         2
C. Experiment
The experiment was conducted on a table in an exhibition booth at
Keio Techno-Mall 2011 (Fig. 3). We ran the experiment procedure as
a PowerPoint presentation on a 24-inch screen in vertical mode,
without a real robotic arm.
Each experiment session began with an introduction to the purposes
of the study. After the introduction, a randomized sequence of 8
prerecorded robotic arm movements was shown to the participant.
After each movement, the presentation asked the participant to
think of a gesture that would cause the movement. The participant
gave a signal when ready and then performed the gesture. A
conductor triggered the second animation after the gesture was
finished in order to simulate the recognition capability of the
system. After each gesture, the conductor held a short interview
and asked the participant three questions about the body part(s)
used in the gesture, the ease of planning the gesture, and the
ease of recognition of the performed gesture from the
participant’s point of view.
After finishing the session, the participant answered a set of
demographic questions. One of the authors observed all sessions
and took notes on the gestures and conversations for further
analysis. All interactions were recorded by one USB camera and two
Kinect sensors. The conversations between the conductor and the
participants were recorded by a microphone on the USB camera.
Depth information from the Kinect sensors was not used in this
study.
The 19 participants made 152 gestures for the 8 robotic arm
movements. No gesture was discarded due to confusion or
experimental error.
Fig. 3. The experiment setup on a desk with vertical monitor, two Kinect
sensors and one USB camera
Across the 152 gestures, participants used 7 body parts to control
the robotic arm, as shown in Fig. 4. Hands were the most used body
part (57.8%). Upper body and head accounted for 19.1% and 13.2%,
respectively. Finger, mouth, arm, and shoulder made up the
remaining 9.9%. Among the hand gestures for the six movements
(forward, backward, left, right, up, and down), the participants
used one hand (40%) or two hands (60%). All hand gestures for
opening and closing the gripper were two-handed gestures.
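As a rough consistency check on the shares above, the percentages can be converted back into approximate gesture counts. The sketch below is illustrative only: the counts it derives are not reported in the paper, and the grouping key for the four least-used body parts is a label chosen here for convenience.

```python
# Total number of collected gestures, as reported in the text.
TOTAL_GESTURES = 152

# Reported shares (percent of all gestures) per body-part group.
# The last key is our own label for the four parts reported together.
reported_share_percent = {
    "hand": 57.8,
    "upper body": 19.1,
    "head": 13.2,
    "finger/mouth/arm/shoulder": 9.9,
}

# Derived (not reported) approximate counts, rounded to whole gestures.
approx_counts = {
    part: round(TOTAL_GESTURES * pct / 100)
    for part, pct in reported_share_percent.items()
}

share_total = sum(reported_share_percent.values())
count_total = sum(approx_counts.values())

for part, count in approx_counts.items():
    print(f"{part}: about {count} gestures")
print(f"shares sum to {share_total:.1f}%, counts sum to {count_total}")
```

Under these assumptions the rounded counts add up to the 152 collected gestures, so the reported shares are mutually consistent.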
Fig. 4. The body parts used in all 152 gestures (categories shown:
shoulder, arm, mouth, finger, head, body, and hand).
Of the 152 gestures, 106 (69.7%) form 53 pairs of reversible
gestures (e.g. leaning the body forward and backward) for
reversible robotic arm movements (e.g. forward and backward).
Fig. 2. Simulated robotic arm movements used in the experiment:
(a) Forward, (b) Backward, (c) Up, (d) Down, (e) Left, (f) Right,
(g) Open, (h) Close. Each movement was captured at its start
configuration.

                      Forward  Backward  Left   Right  Up     Down   Open   Close
Planning Time (s)     10.86    9.13      8.29   8.90   10.29  8.41   10.21  14.48
Ease of Planning      5.83     5.85      6.11   5.81   5.94   6.44   5.41   4.33
Ease of Recognition   5.56     5.15      5.67   5.69   5.59   5.69   5.12   4.60
Fig. 5. Average planning time of all gestures (left vertical axis)
and their corresponding qualitative scores (right vertical axis).
Ease of planning and ease of recognition were rated on Likert
scales. The first read, “The gesture I performed is easy to plan”.
The second read, “The gesture I performed is easy to be recognized
by the robot”. Both scales range from 1 = agree to 7 = disagree.

Statistical analysis of the collected data shows that the average
gesture planning time correlates significantly with the average
“ease of planning” score (r = −0.92), and that “ease of planning”
also correlates significantly with “ease of recognition”
(r = 0.9), as shown in Fig. 5. These correlations suggest that the
participants expect more reliable recognition for gestures that
are easier to plan.
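The two correlation coefficients can be recomputed from the per-movement averages listed with Fig. 5. The sketch below assumes that the reported r values are Pearson coefficients over the eight movement averages, which the text does not state explicitly. The function and variable names are illustrative.

```python
import math

# Per-movement averages copied from the data listed with Fig. 5, in the
# order Forward, Backward, Left, Right, Up, Down, Open, Close.
planning_time = [10.86, 9.13, 8.29, 8.90, 10.29, 8.41, 10.21, 14.48]
ease_of_planning = [5.83, 5.85, 6.11, 5.81, 5.94, 6.44, 5.41, 4.33]
ease_of_recognition = [5.56, 5.15, 5.67, 5.69, 5.59, 5.69, 5.12, 4.60]


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r_time_planning = pearson_r(planning_time, ease_of_planning)
r_planning_recognition = pearson_r(ease_of_planning, ease_of_recognition)

print(f"planning time vs. ease of planning: r = {r_time_planning:.2f}")
print(f"ease of planning vs. ease of recognition: r = {r_planning_recognition:.2f}")
```

Under this assumption, the recomputed values agree with the reported r = −0.92 and r = 0.9 to two decimal places. With only eight data points, a single outlying movement such as Close (the longest planning time and the lowest scores) has a large influence on both coefficients.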
When gesture direction is compared with robotic arm movement,
gestures that direct the robotic arm toward a desired direction
were used by the participants in all movements. However, the
forward and backward movements are exceptions: half of the
participants performed gestures that imitated the robotic arm
movement, such as leaning the body forward (Fig. 6a) to make the
robotic arm move forward (Fig. 2a). Video analysis shows that the
participants were not aware of this difference in the other
gestures (left, right, up, down, open, and close).
One-handed and two-handed gestures were used interchangeably by
the participants, except for the gripper open and close movements.
For example, in the left (Fig. 2e) and right (Fig. 2f) movements,
participants used the left hand, the right hand, or both hands for
gestures (see Fig. 6f and 6h). This suggests that the recognition
system should handle both one-handed and two-handed gestures.
After synchronizing, annotating, and analyzing all videos with
ELAN [17], we carefully selected and prepared a set of
user-defined gestures (Fig. 6). The selection is based on the
number of different users who used the same gesture for the same
robotic arm movement. For the forward, backward, left, right, up,
and down movements, we selected the two most used gestures for
each movement. The gripper open and close gestures are dominated
by the open two-hands (Fig. 6m) and close two-hands (Fig. 6o)
gestures. Fig. 6n and 6p are variations of the open and close
gestures.
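The count-based selection described above can be sketched as a frequency tally per movement. The record format, gesture labels, and function name below are hypothetical. The counts for "up" and "left" follow the figures given with Fig. 6, and the single "other" record is an assumed placeholder for the one "up" gesture that Fig. 6 does not account for.

```python
from collections import Counter, defaultdict

# Hypothetical annotation records: one (movement, gesture label) pair per
# participant, as might be exported after video annotation.
annotations = (
    [("up", "move hand(s) up")] * 13
    + [("up", "move head up")] * 5
    + [("up", "other")] * 1  # assumed placeholder, not reported in the paper
    + [("left", "move hand(s) left")] * 11
    + [("left", "twist/lean body left")] * 8
)


def select_gestures(records, top_n=2):
    """Return the top_n most frequent gesture labels for each movement."""
    counts = defaultdict(Counter)
    for movement, gesture in records:
        counts[movement][gesture] += 1
    # most_common sorts by descending count within each movement.
    return {m: c.most_common(top_n) for m, c in counts.items()}


selected = select_gestures(annotations)
for movement, gestures in selected.items():
    for label, n in gestures:
        print(f"{movement}: {label} ({n} of 19)")
```

Ties between equally frequent gestures would need an explicit rule, for example manual review of the videos. This sketch does not handle that case.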
(a) Forward: lean body forward (5 of 19)
(b) Forward: move one or two hand(s) toward oneself (8 of 19)
(c) Backward: lean body backward (6 of 19)
(d) Backward: move one or two hand(s) toward the robot (8 of 19)
(e) Left: twist/lean body and/or face left (8 of 19)
(f) Left: move one or two hand(s) left (11 of 19)
(g) Right: twist/lean body and/or face right (8 of 19)
(h) Right: move one or two hand(s) right (11 of 19)
(i) Up: move one or two hand(s) up (13 of 19)
(j) Up: move head up (5 of 19)
(k) Down: move one or two hand(s) down (13 of 19)
(l) Down: move head/body down (6 of 19)
(m) Open: open two hands (14 of 19)
(n) Open: open wrists (variation of open two hands)
(o) Close: close two hands (14 of 19)
(p) Close: close wrists (variation of close two hands)
Fig. 6. Set of user-defined gestures extracted from video sequences. The labels read, “movement: gesture (x of 19 participants use this gesture)”.
We show that gestures chosen by users when both hands are occupied
have the following characteristics: (1) hand gestures are the most
used gestures and cover all robotic arm movements in this study
(Fig. 6b, 6d, 6f, 6h, 6i, 6k, 6m and 6o), (2) one hand or two
hands are used interchangeably, and (3) reversible gestures are
preferred when dealing with reversible robotic arm movements.
We plan the following future work toward our research goal: (1)
implement a gesture recognition system based on information from
this work, (2) extend the experiment to a real robotic arm with
more tasks and movements for better generalization and coverage,
and (3) implement the personal robotic workspace for unpredictable
tasks in home and office environments.
This research was funded in part by the Keio Leading-edge
Laboratory of Science & Technology (KLL)’s Research Grant for
Ph.D. Program, 2011. The authors would like to thank all
participants for their time and insightful information. Help and
support from the lab and lab members are gratefully acknowledged.
[1] J. Hammel, K. Hall, D. Lees, L. Leifer, M. V. der Loos, I. Perkash,
and R. Crigler, “Clinical evaluation of a desktop robotic assistant,” J.
Rehabil. Res. Dev., vol. 26, pp. 1–16, 1989.
[2] A. Jain and C. Kemp, “El-e: an assistive mobile manipulator that
autonomously fetches objects from flat surfaces,” Autonomous Robots,
vol. 28, pp. 45–64, 2010.
[3] T. Ende, S. Haddadin, S. Parusel, T. Wusthoff, M. Hassenzahl,
and A. Albu-Schaffer, “A human-centered approach to robot gesture
based communication within collaborative working processes,” in Proc.
IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, 2011, pp. 3367–
[4] J. J. Craig, Introduction to Robotics: Mechanics and Control, 3rd ed.
Prentice Hall, Aug. 2004.
[5] M. A. Goodrich and A. C. Schultz, “Human-robot interaction: a survey,”
Found. Trends Hum.-Comput. Interact., vol. 1, pp. 203–275, January
[6] J. Zhang and A. Knoll, “A two-arm situated artificial communicator
for human-robot cooperative assembly,” IEEE Trans. on Industrial
Electronics, vol. 50, no. 4, pp. 651–658, 2003.
[7] S. Glasauer, M. Huber, P. Basili, A. Knoll, and T. Brandt, “Interacting
in time and space: Investigating human-human and human-robot joint
action,” in Proc. of IEEE Int. Symp. on Robot and Human Interactive
Communication, 2010, pp. 252–257.
[8] R. Stiefelhagen, C. Fugen, R. Gieselmann, H. Holzapfel, K. Nickel, and
A. Waibel, “Natural human-robot interaction using speech, head pose
and gestures,” in Proc. IEEE/RSJ Int Conf on Intelligent Robots and
Systems, vol. 3, 2004, pp. 2422–2427.
[9] N. Kawarazaki, I. Hoya, K. Nishihara, and T. Yoshidome, “Cooperative
welfare robot system using hand gesture instructions,” in Advances in
Rehabilitation Robotics, ser. Lecture Notes in Control and Information
Sciences, Z. Bien and D. Stefanov, Eds. Springer Berlin / Heidelberg,
2004, vol. 306, pp. 143–153.
[10] C. L. Nehaniv, K. Dautenhahn, J. Kubacki, M. Haegele, C. Parlitz,
and R. Alami, “A methodological approach relating the classification of
gesture to identification of human intent in the context of human-robot
interaction,” in Proc. IEEE Int. Workshop Robot and Human Interactive
Communication, 2005, pp. 371–377.
[11] A. G. Brooks and C. Breazeal, “Working with robots and objects: revis-
iting deictic reference for achieving spatial common ground,” in Proc.
of the 1st ACM SIGCHI/SIGART Conf. on Human-robot interaction, ser.
HRI ’06. New York, NY, USA: ACM, 2006, pp. 297–304.
[12] T. P. Spexard, M. Hanheide, and G. Sagerer, “Human-oriented
interaction with an anthropomorphic robot,” IEEE Trans on Robotics,
vol. 23, no. 5, pp. 852–862, 2007.
[13] J. O. Wobbrock, M. R. Morris, and A. D. Wilson, “User-defined gestures
for surface computing,” in Proc. of the Int. Conf. on Human Factors in
Computing Systems. ACM, 2009, pp. 1083–1092.
[14] M. Nielsen, M. Störring, T. Moeslund, and E. Granum, “A procedure
for developing intuitive and ergonomic gesture interfaces for hci,” in
Gesture-Based Communication in Human-Computer Interaction, ser.
Lecture Notes in Computer Science, A. Camurri and G. Volpe, Eds.
Springer Berlin / Heidelberg, 2004, vol. 2915, pp. 105–106.
[15] L. Venetsky and J. W. Tieman, “Robotic gesture recognition system,”
U.S. Patent US 7,606,411 B2, 2009.
[16] J. Lazar, J. H. Feng, and H. Hochheiser, Research methods in human-
computer interaction. John Wiley & Sons, 2010.
[17] H. Sloetjes and P. Wittenburg, “Annotation by category - ELAN and
ISO DCR,” in Int. Conf. on Language Resources and Evaluation, 2008.