Intelligent Autonomous Robotics: A Robot Soccer Case Study
Synthesis Lectures on Artificial Intelligence and Machine Learning
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other), except for brief quotations in printed reviews, without the prior permission of the publisher.
DOI: 10.2200/S00090ED1V01Y200705AIM001
First Edition
10 9 8 7 6 5 4 3 2 1
Morgan & Claypool Publishers
ABSTRACT
Robotics technology has recently advanced to the point of being widely accessible for relatively
low-budget research, as well as for graduate, undergraduate, and even secondary and primary
school education. This lecture provides an example of how to productively use a cutting-edge
advanced robotics platform for education and research by providing a detailed case study with
the Sony AIBO robot, a vision-based legged robot. The case study used for this lecture is the
UT Austin Villa RoboCup Four-Legged Team. This lecture describes both the development
process and the technical details of its end result. The main contributions of this lecture are
(i) a roadmap for new classes and research groups interested in intelligent autonomous robotics
who are starting from scratch with a new robot, and (ii) documentation of the algorithms
behind our own approach on the AIBOs with the goal of making them accessible for use on
other vision-based and/or legged robot platforms.
KEYWORDS
Autonomous robots, legged robots, multi-robot systems, educational robotics, robot soccer, RoboCup
ACKNOWLEDGMENT
This lecture is based on the work and writing of many people, all from the UT Austin Villa
RoboCup team. A significant amount of material is from our team technical report written
after the 2003 RoboCup competition and written in collaboration with Kurt Dresner, Selim T. Erdoğan, Peggy Fidelman, Nicholas K. Jong, Nate Kohl, Gregory Kuhlmann, Ellie Lin,
Mohan Sridharan, Daniel Stronger, and Gurushyam Hariharan [78]. Some material from the
team’s 2004, 2005, and 2006 technical reports, co-authored with a subset of the above people
plus Tekin Meriçli and Shao-en Yu, is also included [79, 80, 81]. This research is supported
in part by NSF CAREER award IIS-0237699, ONR YIP award N00014-04-1-0545, and
DARPA grant HR0011-04-1-0035.
Contents
1. Introduction
2. The Class
3. Initial Behaviors
4. Vision
   4.1 Camera Settings
   4.2 Color Segmentation
   4.3 Region Building and Merging
   4.4 Object Recognition with Bounding Boxes
   4.5 Position and Bearing of Objects
   4.6 Visual Opponent Modeling
5. Movement
   5.1 Walking
       5.1.1 Basics
       5.1.2 Forward Kinematics
       5.1.3 Inverse Kinematics
       5.1.4 General Walking Structure
       5.1.5 Omnidirectional Control
       5.1.6 Tilting the Body Forward
       5.1.7 Tuning the Parameters
       5.1.8 Odometry Calibration
   5.2 General Movement
       5.2.1 Movement Module
       5.2.2 Movement Interface
       5.2.3 High-Level Control
   5.3 Learning Movement Tasks
       5.3.1 Forward Gait
       5.3.2 Ball Acquisition
8. Localization
   8.1 Background
       8.1.1 Basic Monte Carlo Localization
       8.1.2 MCL for Vision-Based Legged Robots
   8.2 Enhancements to the Basic Approach
       8.2.1 Landmark Histories
       8.2.2 Distance-Based Updates
       8.2.3 Extended Motion Model
   8.3 Experimental Setup and Results
       8.3.1 Simulator
       8.3.2 Experimental Methodology
       8.3.3 Test for Accuracy and Time
       8.3.4 Test for Stability
       8.3.5 Extended Motion Model
       8.3.6 Recovery
   8.4 Localization Summary
9. Communication
   9.1 Initial Robot-to-Robot Communication
   9.2 Message Types
   9.3 Knowing Which Robots Are Communicating
   9.4 Determining When a Teammate Is "Dead"
   9.5 Practical Results
12. Behaviors
   12.1 Goal Scoring
       12.1.1 Initial Solution
       12.1.2 Incorporating Localization
       12.1.3 A Finite State Machine
   12.2 Goalie
13. Coordination
   13.1 Dibs
       13.1.1 Relevant Data
       13.1.2 Thrashing
       13.1.3 Stabilization
       13.1.4 Taking the Average
       13.1.5 Aging
       13.1.6 Calling the Ball
       13.1.7 Support Distance
       13.1.8 Phasing Out Dibs
   13.2 Final Strategy
       13.2.1 Roles
       13.2.2 Supporter Behavior
       13.2.3 Defender Behavior
       13.2.4 Dynamic Role Assignment
14. Simulator
   14.1 Basic Architecture
   14.2 Server Messages
   14.3 Sensor Model
   14.4 Motion Model
   14.5 Graphical Interface
15. UT Assist
   15.1 General Architecture
   15.2 Debugging Data
       15.2.1 Visual Output
       15.2.2 Localization Output
       15.2.3 Miscellaneous Output
   15.3 Vision Calibration
B. Kicks
   B.1 Initial Kick
   B.2 Head Kick
   B.3 Chest-Push Kick
   B.4 Arms Together Kick
   B.5 Fall-Forward Kick
   B.6 Back Kick
C. TCPGateway
References
Biography
CHAPTER 1
Introduction
Robotics technology has recently advanced to the point of being widely accessible for relatively
low-budget research, as well as for graduate, undergraduate, and even secondary and primary
school education. However, for most interesting robot platforms, there remains a substantial
learning curve or “ramp-up cost” to learning enough about the robot to be able to use it
effectively. This learning curve cannot be easily eliminated with published curricula or how-
to guides, both because the robots tend to be fairly complex and idiosyncratic, and, more
importantly, because robot technology is advancing rapidly, often making previous years’ models
obsolete as quickly as competent educational guides can be created.
This lecture provides an example of how to productively use a cutting-edge advanced
robotics platform for education and research by providing a detailed case study with the Sony
AIBO robot. Because the AIBO is (i) a legged robot with primarily (ii) vision-based sensing,
some of the material will be particularly appropriate for robots with similar properties, both of
which are becoming increasingly prevalent. However, more generally, the lecture will focus on
the steps required to start with a new robot “out of the box” and to quickly use it for education
and research.
The case study used for this lecture is the UT Austin Villa RoboCup Four-Legged
Team. In 2003, UT Austin Villa was a new entry in the ongoing series of RoboCup legged
league competitions. The team development began in mid-January of 2003, at which time
none of the team members had any familiarity with the AIBOs. Without using any RoboCup-
related code from other teams, we entered a team in the American Open competition at
the end of April, and met with some success at the annual RoboCup competition that took
place in Padova, Italy, at the beginning of July. By 2004, the team became one of the top teams
internationally, and started generating a series of research articles in competitive conferences and
journals.
RoboCup, or the Robot Soccer World Cup, is an international research initiative designed
to advance the fields of robotics and artificial intelligence by using the game of soccer as a
substrate challenge domain [3, 6, 39, 41, 52, 54, 57, 77, 90]. The long-term goal of RoboCup
is, by the year 2050, to build a full team of 11 humanoid robot soccer players that can beat
the best human soccer team on a real soccer field [42]. RoboCup is organized into several
different leagues, including a computer simulation league and two leagues that use wheeled
robots. The case study presented in this lecture concerns the development of a new team for
the Sony four-legged league1 in which all competitors use identical Sony AIBO robots and
the Open-R software development kit.2 Here, teams of four AIBOs, equipped with vision-
based sensors, play soccer on a color-coded field. Figure 1.1 shows one of the robots along
with an overhead view of the playing field. As seen in the diagram, there are two goals, one
at each end of the field and there is a set of visually distinct beacons (markers) situated at
fixed locations around the field. These objects serve as the robot’s primary visual landmarks for
localization.
FIGURE 1.1: An image of the AIBO and the field. The robot has a field of view of 56.9° (horizontal) and 45.2° (vertical), with which it can use the two goals and the four visually distinct beacons at the field corners for the purposes of localization.
The Sony AIBO robot used by all the teams is roughly 280 mm tall (head to toe) and
320 mm long (nose to tail). It has 20 degrees of freedom: 3 in its head, 3 in each leg, and 5
more in its mouth, ears, and tail. It is equipped with a CMOS color camera at the tip of its
nose with a horizontal field-of-view of 56.9° and a vertical field-of-view of 45.2°. Images are
captured at 30 frames per second in the YCbCr image format. The robot also has a wireless
1. http://www.tzi.de/4legged/
2. http://openr.aibo.com/
LAN card that allows for communication with other robots or an off-board computer. All
processing is performed on-board the robot, using a 576 MHz processor.3 Since all teams use
identical robots, the four-legged league amounts to essentially a software competition.
This lecture details both the development process and the technical details of its end
result, a new RoboCup team, called UT Austin Villa,4 from the Department of Computer
Sciences at The University of Texas at Austin. The main contributions are
1. A roadmap for new classes and research groups interested in intelligent autonomous
robotics who are starting from scratch with a new robot; and
2. Documentation of the algorithms behind our own approach on the AIBOs with the
goal of making them accessible for use on other vision-based and/or legged robot
platforms.
As a case study, this lecture contains significant material that is motivated by the specific
robot soccer task. However, the main general feature of the class and research program described
is that there was a concrete task-oriented goal with a deadline. Potential tasks other than soccer
include autonomous surveillance [1, 56], autonomous driving [50], search and rescue [51], and
anything else that requires most of the same subtask capabilities as robot soccer as described in
Chapter 2.
Though development on the AIBOs has continued in our group for several years after the
initial ramp-up, this lecture focuses extensively on the first year’s work as an example of starting
up education and research on a new robot from scratch. Some of the later years’ developments
are also documented as useful and appropriate.
In the four-legged league, as in all RoboCup leagues, the rules are changed over time
to make the task incrementally more difficult. For example, in the first year of competition
documented in this lecture (2003), the field was 2.9 m × 4.4 m and there were walls surrounding
the field. By 2005, the field had been enlarged to 3.6 m × 5.4 m and the walls were removed.
As such, some of the images and anecdotes in this lecture reflect slightly different scenarios.
Nonetheless, the basic flow of games has remained unchanged and can be summarized as
follows.
- Teams consist of four robots each;
- Games consist of two 10-minute halves, with teams switching sides and uniform colors at half-time;
3. These specifications describe the most recent ERS-7 model. Some of the details described in this lecture pertain to the earlier ERS-210A, which was slightly smaller, had slightly lower image resolution, and a somewhat slower processor. Nonetheless, from a high level, most of the features of these two models are similar.
4. http://www.cs.utexas.edu/~AustinVilla
Since some of these rules rely on human interpretation, there have been occasional
arguments about whether a robot should be penalized (sometimes hinging on what the
robot “intended” (!) to do). But, for the most part, they have been effectively enforced and
adhered to in a sportsmanlike way. Full rules for each year are available online at the four-
legged-league page cited above.
The following chapter outlines the structure of the graduate research seminar that was
offered as a class during the Spring semester of 2003 and that jump-started our project. At the
end of that chapter, I outline the structure of the remainder of the lecture.
CHAPTER 2
The Class
The UT Austin Villa legged robot team began as a focused class effort during the Spring
semester of 2003 at The University of Texas at Austin. Nineteen graduate students and one undergraduate student were enrolled in the course CS395T: Multi-Robot Systems: Robotic Soccer
with Legged Robots.1
At the beginning of the class, neither the students nor the professor (myself) had any
detailed knowledge of the Sony AIBO robot. Students in the class studied past approaches
to four-legged robot soccer, both as described in the literature and as reflected in publicly
available source code. However, we developed the entire code base from scratch with the goals
of learning about all aspects of robot control and of introducing a completely new code base to
the community.
Class sessions were devoted to students educating each other about their findings and
progress, as well as coordinating the integration of everybody’s code. Just nine weeks after their
initial introduction to the robots, the students already had preliminary working solutions to
vision, localization, fast walking, kicking, and communication.
The concrete goal of the course was to have a completely new working solution by the
end of April so that we could participate in the RoboCup American Open competition, which
happened to fall during the last week of the class. After that point, a subset of the students
continued working towards RoboCup 2003 in Padova.
The class was organized into three phases. Initially, the students created simple behaviors
with the sole aim of becoming familiar with Open-R.
Then, about two weeks into the class, we shifted to phase two by identifying key subtasks
that were important for creating a complete team. Those subtasks were
- Vision;
- Movement;
- Fall Detection;
- Kicking;
1. http://www.cs.utexas.edu/~pstone/Courses/395Tspring03
During this phase, students chose one or more of these subtasks and worked in subgroups
on generating initial solutions to these tasks in isolation.
By about the middle of March, we were ready to switch to phase three, during which
we emphasized “closing the loop,” or creating a single unified code-base that was capable of
playing a full game of soccer. We completed this integration process in time to enter a team in
the RoboCup American Open competition at the end of April.
The remainder of the lecture is organized as follows. Chapter 3 documents some of the
initial behaviors that were generated during phase one of the class. Next, the output of some of
the subgroups that were formed in phase two of the class is documented in Chapters 4–8. Next,
the tasks that occupied phase three of the class are documented, namely those that allowed us to
put together the above modules into a cohesive code base (Chapters 9–13). Chapters 14 and 15
introduce our simulator and debugging and development tools, and Chapter 16 concludes.
In all chapters, emphasis is placed on the general lessons learned, with some of the more
AIBO-specific details left for the appendices.
CHAPTER 3
Initial Behaviors
The first task for the students in the class was to learn enough about the AIBO to be able to compile and run a simple program on it.
The open source release of Open-R came with several sample programs that could
be compiled and loaded onto the AIBO right away. These programs could do simple tasks
such as the following.
L-Master-R-Slave: Cause the right legs to mirror manual movements of the left legs.
Ball-Tracking-Head: Cause the head to turn such that the pink ball is always in the center of
the visual image (if possible).
PID control: Move a joint to a position specified by the user by typing in a telnet window.
The students were to pick any program and modify it, or combine two programs in any
way. The main objective was to make sure that everyone was familiar with the process for
compiling and running programs on the AIBOs. Some of the resulting programs included the
following.
After becoming familiar with the compiling and uploading process, the next task for
the students was to become more familiar with the AIBO’s operating system and the Open-
R interface. To that end, they were required to create a program that added at least one new
subject–observer connection to the code.1 The students were encouraged to create a new Open-
R object from scratch. Pattern-matching from the sample code was permitted, but creating
an object as different as possible from the sample code was preferred.
Some of the responses to this assignment included the following.
- The ability to turn LEDs on and off by pressing one of the robot's sensors.
- A primitive walking program that walks forward when it sees the ball.
- A program that alternates blinking the LEDs and flapping the ears.
After this assignment, which was due after just the second week of the class, the students
were familiar enough with the robots and the coding environment to move on to their more
directed tasks with the aim of creating useful functionality.
1. A subject–observer connection is a pipe by which different Open-R objects can communicate and be made interdependent. For example, one Open-R object could send a message to a second object whenever the back sensor is pressed, causing the second object to, for example, suspend its current task or change to a new mode of operation.
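To make the footnote's example concrete, the following is a generic C++ sketch of a subject–observer connection. It is not the actual Open-R API (whose object registration and message-passing calls differ); it only illustrates the underlying pattern of one object reacting to another's event.

#include <functional>
#include <iostream>
#include <vector>

// A minimal subject that notifies registered observers of an event.
class Subject {
    std::vector<std::function<void()>> observers;
public:
    void subscribe(std::function<void()> cb) { observers.push_back(std::move(cb)); }
    void notify() { for (auto& cb : observers) cb(); }
};

int main() {
    Subject backSensor;  // stands in for an Open-R object publishing sensor events
    backSensor.subscribe([] { std::cout << "suspend current task\n"; });
    backSensor.subscribe([] { std::cout << "switch to a new mode of operation\n"; });
    backSensor.notify(); // e.g., fired when the back sensor is pressed
}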
CHAPTER 4
Vision
The ability of a robot to sense its environment is a prerequisite for any decision making. Robots
have traditionally used mainly range sensors such as sonars and laser range finders. However,
camera and processing technology has recently advanced to the point where modern robots
are increasingly equipped with vision-based sensors. Indeed, on the AIBO, the camera is the
main source of sensory information, and as such, we placed a strong emphasis on the vision
component of our team.
Since computer vision is an active area of research, there is not yet any perfect solution. As such, our vision module has undergone continual development over the course of this multi-year project. This lecture focuses on the progress made during our first year
as an example of what can be done relatively quickly. During that time, the vision reached a
sufficient level to support all of the localization and behavior achievements described in the rest
of this lecture. Our progress since the first year is detailed in our 2004 and 2005 team technical
reports [79, 80], as well as a series of research papers [71–73, 76].
Our vision module processes the images taken by the CMOS camera located on the
AIBO. The module identifies colors in order to recognize objects, which are then used to
localize the robot and to plan its operation.
Our visual processing is done using the established procedure of color segmentation
followed by object recognition. Color segmentation is the process of classifying each pixel in an
input image as belonging to one of a number of predefined color classes based on the knowledge
of the ground truth on a few training images. Though the fundamental methods employed in
this module have been applied previously (both in RoboCup and in other domains), it has been
built from scratch like all the other modules in our team. Hence, the implementation details
provided are our own solutions to the problems we faced along the way. We have drawn some
of the ideas from the previous technical reports of CMU [89] and UNSW [9]. This module
can be broadly divided into two stages: (i) low-level vision, where the color segmentation and
region building operations are performed and (ii) high-level vision, wherein object recognition
is accomplished and the position and bearing of the various objects in the visual field are
determined.
The remainder of this chapter presents detailed descriptions of the subprocesses of our
overall vision system. But first, for the sake of completeness, a brief overview of the AIBO
robot’s CMOS color camera is presented. The reader who is not interested in details of the
AIBO robot can safely skip to Section 4.2.
4.1 CAMERA SETTINGS
- White Balance:
  1. Indoor-mode: 2800 K.
  2. FL-mode: 4300 K.
  3. Outdoor-mode: 7000 K.
This setting, as the name suggests, is basically a color correction system to accommodate
varying lighting conditions. The idea is that the camera needs to identify the ‘white point’ (such
that white objects appear white) so that the other colors are mapped properly. We found that
this setting does help in increasing the separation between colors and hence helps in better
object recognition. The optimum setting depends on the “light temperature” registered on
the field (this in turn depends on the type of light used, i.e., incandescent, fluorescent, etc.).
For example, in our lab setting, we noticed a better separation between orange and yellow
with the Indoor setting than with the other settings. This helped us in distinguishing the
orange ball from the other yellow objects on the field such as the goal and sections of the
beacons.
- Shutter Speed:
  1. Slow: 1/50 s.
  2. Mid: 1/100 s.
  3. Fast: 1/200 s.
This setting denotes the time for which the shutter of the camera allows light to enter the
camera. The higher settings (larger denominators) are better when we want to freeze the action
in an image. We noticed that both the ‘Mid’ and the ‘Fast’ settings did reasonably well though
the ‘Fast’ setting seemed the best, especially considering that we want to capture the motion of
the ball. Here, the lower settings would result in blurred images.
- Gain:
  1. Low: 0 dB.
  2. Mid: 0 dB.
  3. High: 6 dB.
This parameter sets the camera gain. In this case, we did not notice any major difference in
performance among the three settings provided.
4.2 COLOR SEGMENTATION
The motivation behind using the nearest-neighbor (NNr) approach is that the colors under consideration
overlap in the YCbCr space (some, such as orange and yellow, do so by a significant amount).
Unlike other common methods that try to divide the color space into cuboidal regions (or a
collection of planes), the NNr scheme allows us to learn a color table where the individual blobs
are defined more precisely.
The original color space has three dimensions, corresponding to the Y, Cb, and Cr
channels of the input image. To build the color table (used for classification of the subsequent
images on the robot), we maintain three different types of color cubes in the training phase:
one Intermediate (IM) color cube corresponding to each color, a Nearest Neighbor cube, and
a Master (M) cube (the names will make more sense after the description given below). To
reduce storage requirements, we operate at half the resolution, i.e., all the cubes have their
numerical values scaled to range from 0 to 127 along each dimension. The cells of the IM cubes
are all initialized to zero, while those of the NNr cube and the M cube are initialized to nine
(the color black, also representing background).
Color segmentation begins by first training on a set of images using UT Assist, our
Java-based interface/debugging tool (for more details, see Chapter 15). A robot is placed at a
few points on the field. Images are captured and then transmitted over the wireless network
to a remote computer running the Java-based server application. The objects of interest (goals,
beacons, robots, ball, etc.) in the images are manually “labeled” as belonging to one of the
color classes previously defined, using the Image Segmenter (see Chapter 15 for some pictures
showing the labeling process). For each pixel of the image that we label, the cell determined by
the corresponding YCbCr values (after transforming to half-resolution), in the corresponding
IM cube, is incremented by 3 and all cells a certain Manhattan distance away (within 2 units)
from this cell are incremented by 1. For example, if we label a pixel on the ball orange in the
image and this pixel corresponds to a cell (115, 35, 60) based on the intensity values of that
pixel in the image, then in the orange IM cube this cell is incremented by 3 while the cells such
as (115, 36, 61) and (114, 34, 60) (among others) which are within a Manhattan distance of
2 units from this cell, in the orange IM cube alone, are incremented by 1. For another example,
see Fig. 4.1.
FIGURE 4.1: An example of the development of the color table, specifically the IM cube. Part (a) shows the general coordinate frame for the color cubes. Part (b) shows a planar subsection of one of the IM cubes before labeling. Part (c) depicts the same subsection after the labeling of a pixel that maps to the cell at the center of the subsection. Here only one plane is shown; the same operation occurs across all planes passing through the cell under consideration, such that all cells a certain Manhattan distance away from this cell are incremented by 1.
The training process is performed incrementally, so at any stage we can generate a single
cube (the NNr cube is used for this purpose) that can be used for segmenting the subsequent
images. This helps us to see how “well-trained” the system is for each of the colors and serves as a
feedback mechanism that lets us decide which colors need to be trained further. To generate the
NNr cube, we traverse each cell in the NNr cube and compare the values in the corresponding
cell in each of the IM cubes and assign to this cell the index of the IM cube that has the
maximum value in this cell, i.e., ∀(p, q, r) ∈ [0, 127],

NNrCube(y_p, cb_q, cr_r) = argmax_i IMCube_i(y_p, cb_q, cr_r). (4.1)
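As a concrete illustration, here is a minimal C++ sketch of the training increments and the NNr-cube generation described above. The cube layout, type widths, and the color indexing (classes 0-8, with 9 as black/background) are our assumptions for the sketch; the team's actual data structures differ.

#include <array>
#include <cstdint>
#include <cstdlib>

constexpr int N = 128;           // half-resolution cells per channel
constexpr int NUM_COLORS = 9;    // trained color classes, indices 0-8 (assumed)
constexpr int BACKGROUND = 9;    // untrained cells default to black/background

using IMCube = std::array<uint16_t, N * N * N>;
using ColorCube = std::array<uint8_t, N * N * N>;

inline int idx(int y, int cb, int cr) { return (y * N + cb) * N + cr; }

std::array<IMCube, NUM_COLORS> imCubes{};  // one IM cube per color

// Label one training pixel (already scaled to half resolution):
// +3 at its own cell, +1 at every other cell within Manhattan distance 2.
void labelPixel(int color, int y, int cb, int cr) {
    for (int dy = -2; dy <= 2; ++dy)
        for (int db = -2; db <= 2; ++db)
            for (int dr = -2; dr <= 2; ++dr) {
                int d = std::abs(dy) + std::abs(db) + std::abs(dr);
                int yy = y + dy, bb = cb + db, rr = cr + dr;
                if (d > 2 || yy < 0 || yy >= N || bb < 0 || bb >= N ||
                    rr < 0 || rr >= N) continue;
                imCubes[color][idx(yy, bb, rr)] += (d == 0) ? 3 : 1;
            }
}

// Build the NNr cube per Eq. (4.1): each cell takes the index of the IM
// cube with the maximum count there; untrained cells stay at background.
ColorCube buildNNrCube() {
    ColorCube nnr{};
    for (int c = 0; c < N * N * N; ++c) {
        int best = BACKGROUND;
        uint16_t bestVal = 0;
        for (int i = 0; i < NUM_COLORS; ++i)
            if (imCubes[i][c] > bestVal) { bestVal = imCubes[i][c]; best = i; }
        nnr[c] = static_cast<uint8_t>(best);
    }
    return nnr;
}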
When we use this color cube to segment subsequent images, we use the NNr scheme.
For each pixel in the test image, the YCbCr values (transformed to half-resolution) are used
to index into this NNr cube. Then we compute the weighted average of the value of this cell
and those cells that are a certain Manhattan distance (we use 2–3 units) around it to arrive at
a value that is set as the “numerical color” (i.e. the color class) of this pixel in the test image.
The weights are inversely related to the Manhattan distance from the central cell, i.e., the greater this distance, the smaller the significance attached to the value in the corresponding cell (see Fig. 4.2).

FIGURE 4.2: An example of the weighted average applied to the NNr cube (a two-dimensional representative example). Part (a) shows the values along a plane of the NNr cube before the NNr scheme is applied to the central cell. Part (b) shows the same plane after the NNr update for its central cell. We are considering cells within a Manhattan distance of 2 units along the plane. For this central cell, color label 1 gets a vote of 3 + 1 + 1 + 1 = 6, while label 3 gets a vote of 2 + 2 + 2 + 2 + 1 + 1 + 1 + 1 + 1 = 13, which makes the central cell's label 3. This is the value that is set as the classification result. This is also the value that is stored in the cell of the M cube that corresponds to the central cell.
We do the training over several images (around 20–30) by placing the robot at suitable
points on the field. The idea here is to train on images that capture the beacons, goals, ball,
and the robots from different distances (and also different angles for the ball) to account for
the variations in lighting along different points on the field. This is especially important for the
orange ball, whose color could vary from orange to yellow to brownish-red depending on the
amount of lighting available at that point. We also train with several different balls to account for
the fact that there is a marked variation in color among different balls. At the end of the training
process, we have all the IM cubes with the corresponding cells suitably incremented. The NNr
operation is computationally intensive to perform on the robot’s processor. To overcome this,
we precompute the result of performing this operation (the Master Cube is used for this) from
the corresponding cells in the NNr color Cube, i.e., we traverse each cell of the M Cube and
compute the “Nearest Neighbor” value from the corresponding cells in the NNr cube. In other
words, ∀(p, q, r) ∈ [0, 127] with a predefined Manhattan distance ManDist ∈ [3, 7],

MCube(y_p, cb_q, cr_r) = argmax_i Score(i), (4.2)

where, ∀(k_1, k_2, k_3) ∈ [0, 127],

Score(i) = Σ_{k_1,k_2,k_3} [ ManDist − (|k_1 − p| + |k_2 − q| + |k_3 − r|) ],
           summed over all (k_1, k_2, k_3) such that
           (|k_1 − p| + |k_2 − q| + |k_3 − r|) < ManDist ∧ NNrCube(y_{k_1}, cb_{k_2}, cr_{k_3}) = i. (4.3)
This cube is loaded onto the robot’s memory stick. The pixel-level segmentation process
is reduced to that of a table lookup and takes ≈ 0.120 ms per image. For an example of the
color segmentation process and the Master Cube generated at the end of it, see Fig. 15.3.
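At runtime, the pixel-level classification then collapses to the lookup below; this is a hedged sketch assuming full-range 0-255 camera values and the ColorCube layout from the training sketch above.

// Classify one camera pixel: shift each 8-bit channel to half
// resolution and index the precomputed Master cube.
uint8_t classifyPixel(const ColorCube& mCube, uint8_t y, uint8_t cb, uint8_t cr) {
    return mCube[idx(y >> 1, cb >> 1, cr >> 1)];
}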
One important point about our initial color segmentation scheme is that we do not make
an effort to normalize the cubes based on the number of pixels (of each color) that we train on.
So, if we labeled a number of yellow pixels and a relatively smaller number of orange pixels,
then we would be biased towards yellow in the NNr cube. This is not a problem if we are
careful during the training process and label regions such that all colors get (roughly) equal
representation.
Previous research in the field of segmentation has produced several good algorithms, for
example, mean-shift [14] and gradient-descent based cost-function minimization [85]. But
these involve more computation than is feasible to perform on the robots. A variety of previous
approaches have been implemented on the AIBOs in the RoboCup domain, including the use
of decision trees [8] and the creation of axis-parallel rectangles in the color space [12]. Our
approach is motivated by the desire to create fully general mappings for each YCbCr value [21].
Table 4.1: The basic RunRegion and BoundingBox data structures with which we operate.
struct RunRegion {
    int color;              //color associated with the run region.
    RunRegion* root;        //the root node of the run region.
    RunRegion* listNext;    //pointer to the next run region in the current run length.
    RunRegion* nextRun;     //presumably the next run length in this region.
    int xLoc;               //x location of the root node.
    int yLoc;               //y location of the root node.
    int runLength;          //number of run lengths within this region.
    int boundingBox;        //the bounding box that this region belongs to.
};

struct BoundingBox {
    BoundingBox* prevBox;   //pointer to the previous bounding box.
    BoundingBox* nextBox;   //pointer to the next bounding box.
    int ULx;                //upper left corner x coordinate.
    int ULy;                //upper left corner y coordinate.
    int LRx;                //lower right corner x coordinate.
    int LRy;                //lower right corner y coordinate.
    bool lastBox;           //presumably whether this is the last box in the list.
    int valid;              //presumably a validity flag for this box.
    int numRunLengths;      //number of run lengths associated with this bounding box.
    int numPixels;          //number of pixels in this bounding box.
    int rrcount;            //presumably a count of run regions in this box.
    int color;              //color corresponding to this bounding box.
    RunRegion* listRR;      //presumably the head of the box's run-region list.
    RunRegion* eoList;      //presumably the end of the box's run-region list.
    float prob;             //probability corresponding to this bounding box.
};
Next, we need to merge the run-lengths/bounding boxes corresponding to the same object
together under the assumption that an object in the image will be represented by connected
run-lengths (see [58] for a description of some techniques for performing the merging). In the
second pass, we proceed along the run-lengths (in the order in which they are present in the
linked list) and check for pixels of the same color immediately below each pixel over which the
run-length extends, merging run-lengths of the same color that have significant overlap (the threshold number of overlapping pixels was determined experimentally; see Appendix A.1).
When two run-lengths are to be merged, one of the bounding boxes is deleted while the
other’s properties (root node, number of run-lengths, size, etc.) are suitably modified to include
both the bounding boxes. This is accomplished by moving the corresponding pointers around
appropriately. By incorporating suitable heuristics, we remove bounding boxes that are not
significantly large or dense enough to represent an object of interest in the image, and at the
end of this pass, we end up with a number of candidate bounding boxes, each representing a
blob of one of the nine colors under consideration. The bounding boxes corresponding to each
color are linked together in a separate linked list, which (if required) is sorted in descending
order of size for ease of further processing. Details of the heuristics used here can be found in
Appendix A.1.
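A simplified sketch of the two-pass idea follows: extract horizontal runs from the segmented label image, then union runs of the same color that overlap on adjacent rows. The Run structure and the one-pixel overlap threshold here are illustrative; the team's module uses the RunRegion/BoundingBox structures of Table 4.1 and the tuned thresholds of Appendix A.1.

#include <cstdint>
#include <vector>

struct Run { int row, xStart, xEnd, color, parent; };

// Union-find root with path halving over the run array.
int findRoot(std::vector<Run>& runs, int i) {
    while (runs[i].parent != i) i = runs[i].parent = runs[runs[i].parent].parent;
    return i;
}

std::vector<Run> buildRegions(const std::vector<std::vector<uint8_t>>& img) {
    std::vector<Run> runs;
    // Pass 1: run-length encode each row of the label image.
    for (int y = 0; y < (int)img.size(); ++y)
        for (int x = 0; x < (int)img[y].size();) {
            int x0 = x, c = img[y][x];
            while (x < (int)img[y].size() && img[y][x] == c) ++x;
            runs.push_back({y, x0, x - 1, c, (int)runs.size()});
        }
    // Pass 2: merge same-colored runs on adjacent rows that overlap.
    for (size_t i = 0; i < runs.size(); ++i)
        for (size_t j = i + 1; j < runs.size() && runs[j].row <= runs[i].row + 1; ++j)
            if (runs[j].row == runs[i].row + 1 && runs[j].color == runs[i].color &&
                runs[j].xStart <= runs[i].xEnd && runs[j].xEnd >= runs[i].xStart)
                runs[findRoot(runs, j)].parent = findRoot(runs, i);
    return runs;
}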
When processing images in real time, the low-level vision components described in
Sections 4.2 and 4.3—color segmentation and blob formation—take ≈20 ms per frame (of the
available ≈33 ms).
Table 4.2: The basic VisionObject and associated data structures with which we operate.
struct VisionObjects {
    int NumberOfObjects;    //number of vision objects in the current frame.
    BBox* ObjectInfo;       //array of objects in view.
};

struct BBox {
    int ObjID;              //object ID.
    Point ul;               //upper left point of the bounding box.
    Point lr;               //lower right point of the bounding box.
    double prob;            //likelihood corresponding to this bounding box/object.
};

struct Point {
    double x;               //x coordinate.
    double y;               //y coordinate.
};
through the list (not considering very small ones), once again based on heuristics on size, aspect
ratio, density, etc. To deal with cases with partial occlusions, which is quite common with the
ball on the field, we use the “circle method” to estimate the equation of the circle that best
describes the ball (see Appendix A.3 for details). Basically, this involves finding three points
on the edge of the ball and finding the equation of the circle passing through the three points.
This method seems to give us an accurate estimate of the ball size (and hence the ball distance)
in most cases. In the case of the ball, in addition to the check that helps eliminate spurious
blobs (floating in the air), checks have to be incorporated to ensure that minor misclassifications in the segmentation stage (explained below) do not lead to detection of the ball in places where it does not exist.
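The geometry behind the circle method is standard: the circle through three non-collinear edge points has its center at the intersection of the perpendicular bisectors. A self-contained sketch follows; how the three edge points are chosen and the team's sanity thresholds are omitted here.

#include <cmath>
#include <optional>

struct Pt { double x, y; };
struct Circle { double cx, cy, r; };

// Circle through three points via the circumcenter determinant;
// returns nothing if the points are (nearly) collinear.
std::optional<Circle> circleFrom3(Pt a, Pt b, Pt c) {
    double d = 2.0 * (a.x * (b.y - c.y) + b.x * (c.y - a.y) + c.x * (a.y - b.y));
    if (std::fabs(d) < 1e-9) return std::nullopt;
    double a2 = a.x * a.x + a.y * a.y, b2 = b.x * b.x + b.y * b.y,
           c2 = c.x * c.x + c.y * c.y;
    double cx = (a2 * (b.y - c.y) + b2 * (c.y - a.y) + c2 * (a.y - b.y)) / d;
    double cy = (a2 * (c.x - b.x) + b2 * (a.x - c.x) + c2 * (b.x - a.x)) / d;
    return Circle{cx, cy, std::hypot(a.x - cx, a.y - cy)};
}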
Next, we tackle the problem of finding the beacons (six field markers, excluding the
goals). The identification of beacons is important in that the accuracy of localization of the
robot depends on the determination of the position and bearing of the beacons (with respect to the robots), which in turn depends on the proper determination of the bounding
boxes associated with the beacons. Since the color pink appears in all beacons, we use that
as the focus of our search. Using suitable heuristics to account for size, aspect ratio, density,
etc., we match each pink blob with blue, green, or yellow blobs to determine the beacons.
We ensure that only one instance of each beacon (the most likely one) is retained. Addi-
tional tests are incorporated to remove spurious beacons: those that appear to be on the field
or in the opponents, floating in the air, inappropriately huge or tiny, etc. For details, see
Appendix A.4.
After this first pass, if the goals have not been found, we search through the candidate
blobs of the appropriate colors with a set of reduced constraints to determine the occurrence
of the goals (which results in a reduced likelihood estimate as we will see below). This is useful
when we need to identify the goals at a distance, which helps us localize better, as each edge of
the goal serves as an additional marker for the purpose of localization.
We found that the goal edges were much more reliable as inputs to the localization
module than were the goal centers. So, once the goals are detected, we determine the edges
of the goal based on the edges of the corresponding bounding boxes. Of course, we include
proper buffers at the extremities of the image to ensure that we detect the actual goal edges and
not the ‘artificial edges’ generated when the robot is able to see only a section of the goal (as a
result of its view angle) and the sides of the truncated goal’s bounding box are mistaken to be
the actual edges.
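A sketch of that buffer check on a goal's bounding box: a vertical edge is trusted only if it lies safely inside the image, so the box side is a real goal edge rather than the image boundary truncating the goal. The buffer width here is a made-up placeholder, not the team's tuned value.

// True if the bounding-box side at column edgeX is a real goal edge,
// i.e., at least `buffer` pixels away from either image border.
bool isRealGoalEdge(int edgeX, int imageWidth, int buffer = 4) {
    return edgeX >= buffer && edgeX <= imageWidth - 1 - buffer;
}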
Next, a brief description of some of the heuristics employed in the detection of the
ball, goals, beacons, and opponents is presented. I begin by listing the heuristics that are
common to all objects and then also list those that are specific to goals, ball, and/or beacons.
expected color to the area of the blob), the number of run-lengths enclosed, and a
minimum number of pixels. All these thresholds were determined experimentally,
and changing these thresholds changes the distance from which the goal can be
detected and the accuracy of detection. For example, lowering these thresholds can
lead to false positives.
(3) The aspect ratio is used to determine the likelihood, and the candidate is accepted
if and only if it has a likelihood measure above a predefined minimum.
(4) When doing a second pass for the goal search, we relax the constraints slightly, but proportionately a lower likelihood measure is assigned to the goal, if detected.
Note: For sample threshold values, see Appendix A.5.
- Ball-specific calculations:
(1) We use the ‘tilt-angle test’ to eliminate spurious blobs from consideration.
(2) In most cases, the ball is severely occluded, precluding the use of the aspect ratio
test. Nonetheless, we first search for an orange object with a high density and
an aspect ratio (1:1) that would detect the ball if it is seen completely and not
occluded.
(3) If the ball is not found with these tight constraints, we relax the aspect ratio
constraint and include additional heuristics (e.g., if the ball is close, even if it is
partially occluded, it should have a large number of run-lengths and pixels) that
help detect a bounding box around the partially occluded ball. These heuristics and
associated thresholds were determined experimentally.
(4) If the yellow goal is found, we ensure that the candidate orange ball does not occur
within it and above the ground (which can happen since yellow and orange are
close in color space).
(5) We check to make sure that the orange ball is found lower than the lower-most
beacon in the current frame. Also, the ball cannot occur above the ground, or
within or slightly below the beacon. The latter can occur if the white and/or yellow
portions of the beacon are misclassified as orange.
(6) We use the “circle method” to detect the actual ball size. But we also include checks
to ensure that in cases where this method fails and we end up with disproportion-
ately huge or very small ball estimates (thresholds determined experimentally), we
either keep the estimates we had before employing the circle method (and ex-
tend the bounding box along the shorter side to form a square to get the closest
approximation to the ball) or reject the ball estimate in the current frame.
Finally, we check for opponents in the current image frame. As in the previous cases,
suitable heuristics are employed both to weed out the spurious cases and to determine the
likelihood of the estimate. To identify the opponents, we first sort the blobs of the corresponding
color in descending order of size, with a minimum threshold on the number of pixels and run-
lengths. We include a relaxed version of the aspect ratio test and strict tilt-angle tests (an
“opponent” blob cannot occur much lower or much higher than the horizon when the robot’s
head has very little tilt and roll) to further remove spurious blobs (see Appendix A.2 and
Appendix A.7). Each time an opponent blob (that satisfies these thresholds) is detected, the
robot tries to merge it with one of its previous estimates based on thresholds. If it does not succeed and it has fewer than four valid (previous) estimates, it adds this estimate to the list
of opponents. At the end of this process, each robot has a list that stores the four largest
bounding boxes (that satisfy all these tests) of the color of the opponent with suitable likelihood
estimates that are determined based on the size of the bounding boxes (see Appendix A.8).
Further processing of opponent estimates using the estimates from other teammates, etc.,
is described in detail in the section on visual opponent modeling (Section 4.6). Once the
processing of the current visual frame is completed, the detected objects, each stored as a VisionObject, are passed through the Brain (the central control module described in Chapter 10) to the GlobalMap module, wherein the VisionObjects are operated upon using localization routines.
the robot’s environment), we can arrive at estimates for the distance and bearing of the object
relative to the robot. The known geometry is used to arrive at an estimate for the variances
corresponding to the distance and the bearing. Suppose the distance and angle estimates for a
beacon are d and θ. Then the variances in the distance and bearing estimates are estimated as

variance_d = (1 / b_p) · (0.1 d), (4.4)

where T_local^global is the 2D-transformation matrix from local to global coordinates.
For the variances in the positions, we use a simple approach:
var_x = var_y = (1 / Opp_prob) · (0.1 d), (4.7)
where the likelihood of the opponent blob, Opp_prob, is determined by heuristics (see Appendix A.8).
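In code, Eqs. (4.4) and (4.7) amount to scaling a 10% range error by the inverse of the vision likelihood, so low-confidence detections get large variances. A sketch under that reading:

// Variance for a detection at distance d with likelihood prob in (0, 1]:
// variance = (1 / prob) * (0.1 * d), per Eqs. (4.4) and (4.7).
double detectionVariance(double d, double prob) {
    return (0.1 * d) / prob;
}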
If we do not have any previous estimates of opponents from this or any previous frame,
we accept this estimate and store it in the list of known opponent positions. If any previous
estimates exist, we try to merge them with the present estimate by checking if they are close
enough (based on heuristics). All merging is performed assuming Gaussian distributions. The
basic idea is to consider the x and y position as independent Gaussians (with the positions as
the means and the associated variances) and merge them (for more details, see [84]). If merging
is not possible and we have fewer than four opponent estimates, we treat this as a new opponent
estimate and store it as such in the opponents list. But if four opponent estimates already exist,
we try to replace one of the previous estimates (the one with the maximum variance in the list of
opponent estimates and with a variance higher than the new estimate) with the new estimate.
Once we have traversed through the entire list of opponent bounding boxes presented by the
vision module, we go through our current list of opponent estimates and degrade all those
estimates that were not updated, i.e., not involved in merging with any of the estimates from
the current frame (for more details on the degradation of estimates, see the initial portions of
Chapter 11 on global maps). When each robot shares its Global Map (see Chapter 11) with its
teammates, these data get communicated.
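A minimal sketch of the Gaussian merging, applied independently to the x and y coordinates as described above; the gating test standing in for the "close enough" heuristic is our assumption, not the team's tuned rule.

#include <cmath>

struct Gauss { double mean, var; };

// Crude gate: means within a few combined standard deviations.
bool closeEnough(const Gauss& a, const Gauss& b, double gate = 3.0) {
    return std::fabs(a.mean - b.mean) <= gate * std::sqrt(a.var + b.var);
}

// Product-of-Gaussians fusion: precision-weighted mean, reduced variance.
Gauss merge(const Gauss& a, const Gauss& b) {
    double s = a.var + b.var;
    return {(a.mean * b.var + b.mean * a.var) / s, (a.var * b.var) / s};
}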
When the robot receives data from its teammates, a similar process is incorporated.
The robot takes each current estimate (i.e., one that was updated in the current cycle) that is
communicated by a teammate and tries to merge it with one of its own estimates. If it fails
to do so and it has fewer than four opponent estimates, it accepts the communicated estimate
as such and adds it to its own list of opponent estimates. But if it already has four opponent
estimates, it replaces its oldest estimate (the one with the largest variance which is larger than
the variance of the communicated estimate too) with the communicated estimate. If this is not
possible, the communicated estimate is ignored.
This procedure, though simple, gives reliable results in nearly all situations once the
degradation and merging thresholds are properly tuned. It was used both during games and in
one of the challenge tasks (see Appendix F.3) during RoboCup, and the performance was good enough for the robot to walk from one goal to the other while avoiding all seven robots placed in its path.
The complete vision module as described in this chapter, starting from color segmentation
up to and including the object recognition phase takes ≈28 ms per frame, enabling us to process
images at frame rate. Though the object recognition algorithm described here does not recognize
lines in the environment, they are also a great source of information, and can be detected reliably
with a time cost of an additional ≈3 ms per frame, thus still staying within frame rate [73].
CHAPTER 5
Movement
Enabling a robot to move precisely and quickly is as essential to any interesting task, including the RoboCup task, as vision is. In this chapter, our approach to AIBO movement
is introduced, including walking and the interfaces from walking to the higher level control
modules.
Just as vision-based robots are increasingly replacing robots with primarily range sensors,
it is now becoming possible to deploy legged robots rather than just wheeled robots. The
insights from this section are particularly relevant to such legged robots.
The AIBO comes with a stable but slow walk. From watching the videos of past
RoboCups, and from reading the available technical reports, it became clear that a fast walk
is an essential part of any RoboCup team. The walk is perhaps the most feasible component
to borrow from another team’s code base, since it can be separated out into its own module.
Nonetheless, we decided to create our own walk in the hopes of ending up with something at least as good as, if not better than, that of other teams, while retaining the ability to fine-tune it
on our own.
The movement component of our team can be separated into two parts. First, the
walking motion itself, and second, an interface module between the low-level control of the
joints (including both walking and kicking) and the decision-making components.
5.1 WALKING
This section details our approach to enabling the AIBOs to walk. Though it includes some
AIBO-specifics, it is for the most part a general approach that could be applied to other legged
robots with multiple controllable joints in each leg.
5.1.1 Basics
At the lowest level, walking is effected on the AIBO by controlling the joint angles of the legs.
Each of the four legs has three joints known as the rotator, abductor, and knee. The rotator is
a shoulder joint that rotates the entire leg (including the other two joints) around an axis that
runs horizontally from left to right. The abductor is the shoulder joint responsible for rotating the leg out to the side, away from the body; the knee bends the lower portion of the leg.
We call the rotator, abductor, and knee angles $J_1$, $J_2$, and $J_3$, respectively. The goal of the forward kinematics is to define the function from $J = (J_1, J_2, J_3)$ to $p = (x, y, z)$, where $p$ is the location of the foot according to our model. We call this function $K_F(J)$. We start with the fact that when $J = (0, 0, 0)$, $K_F(J) = (0, 0, -2)$, which we call $p_0$. This corresponds to the situation where the leg is extended straight down. In this base position for the leg, the knee is at the point $(0, 0, -1)$. We will describe the final location of the foot as the result of a series of three rotations applied to this base position, one for each joint.
First, we associate each joint with the rotation it performs when the leg is in the base position. The rotation associated with the knee, $K(q, \theta)$, where $q$ is any point in space, is a rotation around the line $y = 0$, $z = -1$, counterclockwise through an angle of $\theta$ with the x-axis pointing towards you. The abductor's rotation, $A(q, \theta)$, goes clockwise around the y-axis. Finally, the rotator is $R(q, \theta)$, and it rotates counterclockwise around the x-axis. In general (i.e., when $J_1$ and $J_2$ are not 0), changes in $J_2$ or $J_3$ do not affect $p$ simply by performing the corresponding rotation $A$ or $K$ on it. However, these rotations are very useful because the forward kinematics function can be defined as

$$K_F(J) = R(A(K(p_0, J_3), J_2), J_1). \qquad (5.1)$$

This formulation is based on the idea that for any set of angles $J$, the foot can be moved from $p_0$ to its final position by rotating the knee, abductor, and rotator by $J_3$, $J_2$, and $J_1$, respectively, in that order. This formulation works because when the rotations are done in that order, they are always the rotations $K$, $A$, and $R$. A schematic diagram of the AIBO after each of the first two rotations is shown in Fig. 5.1.
It is never necessary for the robot to calculate x, y, and z from the joint angles, so the
above equation need not be implemented on the AIBO. However, it is the starting point for the
derivation of the Inverse Kinematics, which are constantly being computed while the AIBO is
walking.
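Although Eq. (5.1) need not run on the robot, it is easy to state in code. The following sketch, for reference only, applies the three rotations to the base position under the axis conventions described above.

    #include <cmath>

    struct Vec3 { double x, y, z; };

    // Knee rotation K: about the line y = 0, z = -1 (parallel to the x-axis),
    // counterclockwise with the x-axis pointing toward the viewer.
    Vec3 K(Vec3 q, double th) {
        double y = q.y, z = q.z + 1.0;            // shift the knee to the origin
        Vec3 r = { q.x, y * std::cos(th) - z * std::sin(th),
                        y * std::sin(th) + z * std::cos(th) - 1.0 };
        return r;
    }

    // Abductor rotation A: clockwise about the y-axis.
    Vec3 A(Vec3 q, double th) {
        Vec3 r = { q.x * std::cos(th) - q.z * std::sin(th), q.y,
                   q.x * std::sin(th) + q.z * std::cos(th) };
        return r;
    }

    // Rotator rotation R: counterclockwise about the x-axis.
    Vec3 R(Vec3 q, double th) {
        Vec3 r = { q.x, q.y * std::cos(th) - q.z * std::sin(th),
                        q.y * std::sin(th) + q.z * std::cos(th) };
        return r;
    }

    // Eq. (5.1): knee, then abductor, then rotator, applied to the base position.
    Vec3 forwardKinematics(double J1, double J2, double J3) {
        Vec3 p0 = { 0.0, 0.0, -2.0 };             // leg extended straight down
        return R(A(K(p0, J3), J2), J1);
    }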
FIGURE 5.1: Schematic drawings of the AIBO according to our kinematics model. (a) This is a side
view of the AIBO after rotation K has been performed on the foot. (b) In this front view, rotation A has
also been performed.
Next, note that the shoulder, knee, and foot are the vertices of an isosceles triangle with sides of length 1, 1, and $d$, with central angle $180^\circ - J_3$. This yields the formula

$$J_3 = 2\cos^{-1}\!\left(\frac{d}{2}\right). \qquad (5.3)$$
The inverse cosine here may have two possible values within the range for $J_3$. In this case, we always choose the positive one. While there are some points in three-dimensional space that this excludes (because of the joint ranges for the other joints), those points are not needed for walking. Furthermore, if we allowed $J_3$ to sometimes be negative, it would be very difficult for our function $K_I$ to be continuous over its entire domain.
To compute $J_2$, we must first write out an expression for $K(p_0, J_3)$. It is $(0, \sin J_3, -(1 + \cos J_3))$. This is the position of the foot in Fig. 5.1(a). Then we can isolate the effect of $J_2$ as follows. Since the rotation $R$ is with respect to the x-axis, it does not affect the x-coordinate. Thus, we can make use of the fact that $K_F(J)$, which is defined to be $R(A(K(p_0, J_3), J_2), J_1)$ (Eq. (5.1)), has the same x-coordinate as $A(K(p_0, J_3), J_2)$. Plugging in our expression for $K(p_0, J_3)$, we get that $x = (1 + \cos J_3)\sin J_2$, so that

$$J_2 = \sin^{-1}\!\left(\frac{x}{1 + \cos J_3}\right). \qquad (5.6)$$

Note that this calculation breaks down when $x > 1 + \cos J_3$, since the argument of the inverse sine would exceed 1. Finally, $J_1$ rotates the resulting intermediate point onto $(y, z)$; it is the difference between the polar angles of the two points in the y-z plane (Eq. (5.7)). The result of this subtraction is normalized to be within the range for $J_1$. This concludes the derivation of $J_1$ through $J_3$ from x, y, and z. The computation itself consists simply of the calculations in the four equations (5.2), (5.3), (5.6), and (5.7).
It is worth noting that expressions for $J_1$, $J_2$, and $J_3$ are never given explicitly in terms of x, y, and z. Such expressions would be very convoluted, and they are unnecessary because the serial computation given here can be used instead. Furthermore, we feel that this method yields some insight into the relationships between the leg's joint angles and the foot's three-dimensional coordinates.
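A sketch of the serial computation follows. The closed forms for $J_3$ and $J_2$ are those of Eqs. (5.3) and (5.6); the final step computes $J_1$ as a difference of polar angles in the y-z plane, which is our reading of Eq. (5.7). The code assumes the requested point is reachable (unreachable points are discussed next).

    #include <cmath>

    const double kPi = 3.14159265358979323846;

    struct Joints { double J1, J2, J3; };

    // Serial inverse kinematics following Eqs. (5.2), (5.3), (5.6), (5.7);
    // assumes (x, y, z) is reachable (see the impossible sphere below).
    Joints inverseKinematics(double x, double y, double z) {
        Joints j;
        double d = std::sqrt(x * x + y * y + z * z);   // Eq. (5.2)
        j.J3 = 2.0 * std::acos(d / 2.0);               // Eq. (5.3), positive root
        j.J2 = std::asin(x / (1.0 + std::cos(j.J3)));  // Eq. (5.6)
        // J1 rotates the intermediate point A(K(p0, J3), J2) onto (y, z):
        double ya = std::sin(j.J3);
        double za = -(1.0 + std::cos(j.J3)) * std::cos(j.J2);
        j.J1 = std::atan2(z, y) - std::atan2(za, ya);  // Eq. (5.7), then normalize
        if (j.J1 > kPi)  j.J1 -= 2.0 * kPi;
        if (j.J1 < -kPi) j.J1 += 2.0 * kPi;
        return j;
    }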
There are many points $q$ in three-dimensional space for which there are no joint angles $J$ such that $K_F(J) = q$. For these points, the inverse kinematics formulas are not applicable. One category of such points is intuitively clear: the points whose distance from the origin is greater than 2. These are impossible locations for the foot because the leg is not long enough to reach them from the shoulder. There are also many regions of space that are excluded by the angle ranges of the three joints. However, there is one unintuitive, but important, unreachable region, which we call the impossible sphere. The impossible sphere has a radius of 1 and is centered at the point (1, 0, 0). The following analysis explains why it is impossible for the foot to be in the interior of this sphere.
A point in the interior of the impossible sphere satisfies

$$(x - 1)^2 + y^2 + z^2 < 1$$
$$x^2 - 2x + 1 + y^2 + z^2 < 1$$
$$x^2 + y^2 + z^2 < 2x.$$

Substituting $d^2$ for $x^2 + y^2 + z^2$ and dividing by 2 gives us

$$\frac{d^2}{2} < x. \qquad (5.8)$$

Since $J_3 = 2\cos^{-1}(d/2)$ (Eq. (5.3)), $\cos\frac{J_3}{2} = \frac{d}{2}$, so by the double-angle formula, $\cos J_3 = \frac{d^2}{2} - 1$, or $\frac{d^2}{2} = 1 + \cos J_3$. Substituting for $\frac{d^2}{2}$, we get

$$1 + \cos J_3 < x.$$

This is precisely the condition, as discussed above, under which the calculation of $J_2$ breaks down. This shows that points in the impossible sphere are not in the range of $K_F$.
Occasionally, our parameterized walking algorithm requests a position for the foot that
is inside the impossible sphere. When this happens, we project the point outward from the
center of the sphere onto its surface. The new point on the surface of the sphere is attainable,
so the inverse kinematics formulas are applied to this point.
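This projection is straightforward to state in code; the following sketch scales the offset from the sphere's center out to unit length.

    #include <cmath>

    struct Point { double x, y, z; };

    // Project a requested foot position out of the impossible sphere
    // (radius 1, centered at (1, 0, 0)) onto its surface.
    Point projectOutOfImpossibleSphere(Point p) {
        double dx = p.x - 1.0, dy = p.y, dz = p.z;
        double r = std::sqrt(dx * dx + dy * dy + dz * dz);
        if (r >= 1.0 || r == 0.0) return p;   // already reachable (or degenerate center)
        double s = 1.0 / r;                   // scale the offset out to radius 1
        Point q = { 1.0 + dx * s, dy * s, dz * s };
        return q;
    }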
FIGURE 5.2: The foot traces a half-ellipse as the robot walks forward.
Each step consists of a ground phase, in which the foot moves along the ground, and an air phase, in which it traces the half-ellipse; we parameterize the air phase by the fraction $t$ of the time the foot spends in the air (so that $t$ runs from 0 to 1). While the AIBO is walking forwards, the value of $x$ for any given leg is always constant. The values of $y$ and $z$ during the air phase are given by

$$y = C_1 + C_2\cos(\pi t)$$

and

$$z = -C_3 + C_4\sin(\pi t).$$

In these equations, $C_1$ through $C_4$ are four parameters that are fixed during the walk. $C_1$ determines how far forward the foot is and $C_3$ determines how close the shoulder is to the ground. The parameters $C_2$ and $C_4$ determine how big a step is and how high the foot is raised for each step (Fig. 5.2). Our walk has many other free parameters, which are all described in Section A.9.1.
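Under the reconstruction of the air-phase equations given above, a foot target can be computed as follows; the phase convention (where on the ellipse the step starts) is illustrative.

    #include <cmath>

    const double kPi = 3.14159265358979323846;

    struct FootTarget { double y, z; };

    // Air-phase foot target for a forward step, under the half-ellipse
    // reconstruction above; t runs from 0 (lift-off) to 1 (touch-down).
    FootTarget airPhase(double t, double C1, double C2, double C3, double C4) {
        FootTarget f;
        f.y = C1 + C2 * std::cos(kPi * t);   // sweeps from C1+C2 to C1-C2
        f.z = -C3 + C4 * std::sin(kPi * t);  // peaks C4 above the ground plane
        return f;
    }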
FIGURE 5.3: The main movement direction of the half-ellipses changes for (a) walking sideways and
(b) turning in place. (The dark squares indicate the positions of the four feet when standing still.)
Combinations of walking forwards, backwards, sideways, and turning are also possible by simply combining the different components for the ellipses through vector addition. For example, to walk forwards and to the right at the same time, at an angle of 45° to the y-axis, we would make the ellipses point 45° to the right of the y-axis. Any combination can be achieved as shown in Fig. 5.4.
In practice, the method described here worked well for combinations of forwards and
turning velocities, but we had difficulty also incorporating sideways velocities. The problem
was that, after tuning the parameters (Section 5.1.7), we found that the parameters that worked
well for going forwards and turning did not work well for walking sideways. It was not obvious how to find common parameters that would work for combinations of all three types of velocities.
FIGURE 5.4: Combining forwards, sideways, and turning motions. Each component contributes a
vector to the combination. Dashed lines show the resulting vectors. (We show only half of the ellipse
lengths, for clarity.) With the vectors shown, the robot will be turning towards its right as it moves
diagonally forward and right.
In situations where we needed to walk with a nonzero sideways velocity, we frequently used a slower omnidirectional walk developed by a student in the spring semester class. That walk is called SPLINE WALK, while the one described here is called PARAM WALK. Section 5.2.3 discusses when each of the walks was used.
More recent walking engines by some other teams take an approach similar to the one described in this section, but are somewhat more flexible with regard to providing smooth omnidirectional motion and a set of different locus representations, including rectangular, elliptic, and Hermite-curve-shaped trajectories [2]. Although those representations provide reasonable results, the biggest constraint they have in common is that they are 2D representations. The German team showed that representing loci in 3D yields a better walk, both in terms of speed and in maneuverability [20].
The parameter values we adopted did not allow for a fast forward walk, but with them the velocities were roughly proportional to the step distances for combinations of forward, turning, and sideways velocities.
The proportionality constants are determined by a direct measurement of the relevant
velocities. To measure forward velocity, we use a stopwatch to record the time the robot takes
to walk from one goal line to the other with its forward walking parameters. The time taken is
divided into the length of the field, 4200 mm, to yield the forward velocity. The same process
is used to measure sideways velocity. To measure angular velocity, we execute the walk with
turning parameters. Then we measure how much time it takes to make a certain number of
complete revolutions. This yields a velocity in degrees per second. Finally, the proportionality
constants were calculated by dividing the measured velocities by the corresponding step distance
parameters that gave rise to them.
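The arithmetic is simple; the following worked example uses made-up measurements purely for illustration (they are not our recorded times or parameters).

    #include <cstdio>

    int main() {
        // Hypothetical measurements, for illustration only.
        const double fieldLength = 4200.0;  // mm
        const double forwardTime = 16.8;    // s, goal line to goal line
        const double forwardStep = 70.0;    // mm, step-distance parameter used

        double forwardVel = fieldLength / forwardTime;  // 250 mm/s
        double forwardK   = forwardVel / forwardStep;   // (mm/s) per mm of step

        const double revolutions = 5.0, turnTime = 31.0;     // s for 5 revolutions
        double angularVel = revolutions * 360.0 / turnTime;  // ~58 deg/s

        std::printf("forward %.1f mm/s (k = %.2f), turn %.1f deg/s\n",
                    forwardVel, forwardK, angularVel);
        return 0;
    }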
Since the odometry estimates are used by localization (Chapter 8), the odometry cal-
ibration constants could be tuned more precisely by running localization with a given set of
odometry constants and observing the effects of the odometry on the localization estimates.
Then we could adjust the odometry constants in the appropriate direction to make localization
more accurate. We feel that we were able to achieve quite accurate odometry estimates by a
repetition of this process.
Our movement code is organized into three levels:

1. The lowest level, the "movement module," resides in a separate Open-R object from the
rest of our code (as described in the context of our general architecture in Chapter 10)
and is responsible for sending the joint values to OVirtualRobotComm, the provided
Open-R object that serves as an interface to the AIBO’s motors.
2. One level above the movement module is the “movement interface,” which handles the
work of calculating many of the parameters particular to the current internal state and
sensor values. It also manages the inter-object communication between the movement
module and the rest of the code.
3. The highest level occurs in the behavior module itself (Chapter 12), where the decisions
to initiate or continue entire types of movements are made.
FIGURE 5.5: Inter-object communication involving the movement module. Thick arrows represent a message containing information (from Subject to Observer); thin arrows indicate a message without further information (from Observer to Subject). An arrow ending in a null marker indicates that the message does nothing but enable the service to send another message.
The movement module shares three connections (“services”) with other Open-R objects:
one with the OVirtualRobotComm object mentioned above, and two with the Brain, the
Open-R object which includes most of our code (see Chapter 10 for a description of our general
architecture), including the C++ object corresponding to the movement interface described in
Section 5.2.2. It uses one connection with the Brain to take requests from the Brain for types
of high-level movement, such as walking in a particular direction or kicking. It then converts
them to joint values, and uses its connection with OVirtualRobotComm to request that joint
positions be set accordingly. These requests are sent as often as is allowed—every 8 ms. The
second connection with the Brain allows the movement module to send updates to the Brain
about what movement it is currently performing. Among other things, this lets the Brain know
when a movement it requested has finished (such as a kick). The flow of control is illustrated
by the arrows in Fig. 5.5 (the functions identified in the figure are defined below).
Because the movement module must send an Open-R message to OVirtualRobotComm every time it wants to change a joint position, it is necessary for the movement module to keep an internal state so that it can resume where it left off when OVirtualRobotComm returns control to the movement module. Whenever this happens, the movement module begins execution with the function ReadyEffector, which is called automatically every time OVirtualRobotComm is ready for a new command. ReadyEffector calls the particular function corresponding to the current movement module state, a variable that indicates which type of movement is currently in progress. Many movements (for example, walking and kicking) require that a sequence of sets of joint positions be carried out, so the functions responsible for these movements must be executed multiple times (for multiple messages to OVirtualRobotComm). The states of the movement module are summarized in Table 5.1.

TABLE 5.1: States of the movement module

STATE           DESCRIPTION
INIT            Initial state
IDLE            No leg motion, but joint gains are set (robot is standing)1
PARAM WALK      Fastest walk
SPLINE WALK     Omnidirectional slower walk
KICK MOTION     Kicking
GETUP MOTION    No joint position requests being sent to OVirtualRobotComm, thus allowing built-in Sony getup routines control over all motors
Whereas kicking and getting up require the AIBO’s head to be doing something specific,
neither the idle state nor the two walks require anything in particular from the head joints.
Furthermore, it is useful to allow the head to move independently from the legs whenever
possible (this allows the AIBO to “keep its eye on the ball” while walking, for instance). Thus
the movement module also maintains a separate internal state for the head. If the movement
module’s state is KICK MOTION or GETUP MOTION when ReadyEffector begins
execution, the new joint angles for the head will be specified by the function corresponding to
the movement module state. Otherwise, ReadyEffector calls a function corresponding to the
current head state, which determines the new joint angles for the head, and the rest of the joint
angles are determined by the function for the current movement module state. A summary of
the head states appears in Table 5.2.
1 In practice, this is implemented by executing a "walk" with forward velocity, side velocity, turn velocity, and leg height all equal to 0.
TABLE 5.2: Head states

STATE    DESCRIPTION
IDLE     Head is still (but joint gains are set)
MOVE     Moving head to a specific position
SCAN     Moving head at a constant speed in one direction
KICK     Executing a sequence of head positions
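The overall dispatch can be summarized in code. This is a schematic rendering, not Open-R-accurate: the state names follow Tables 5.1 and 5.2, while the helper methods are placeholders.

    // Schematic rendering of the movement module's state dispatch.
    enum BodyState { INIT, IDLE, PARAM_WALK, SPLINE_WALK, KICK_MOTION, GETUP_MOTION };
    enum HeadState { HEAD_IDLE, HEAD_MOVE, HEAD_SCAN, HEAD_KICK };

    struct MovementModule {
        BodyState body = IDLE;
        HeadState head = HEAD_IDLE;

        // Called whenever OVirtualRobotComm is ready for a new command.
        void ReadyEffector() {
            if (body == KICK_MOTION || body == GETUP_MOTION) {
                stepBody();            // these states own the head joints too
            } else {
                stepHead();            // head runs independently of the legs
                stepBody();
            }
            sendJointValues();         // one message per call, every 8 ms
        }

        // Called when the Brain sends a movement request; takes effect on the
        // next ReadyEffector call. A body movement that needs the head joints
        // causes the simultaneous head request to be ignored.
        void NewParamsNotify(BodyState b, HeadState h) {
            body = b;
            if (b != KICK_MOTION && b != GETUP_MOTION) head = h;
        }

        void stepBody() { /* compute the next set of leg joint angles */ }
        void stepHead() { /* compute the next head pan/tilt/roll angles */ }
        void sendJointValues() { /* forward set points to OVirtualRobotComm */ }
    };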
The movement module listens for commands with a function called NewParamsNotify.
When the Brain sends a movement request, NewParamsNotify accepts it and sets the move-
ment module state and/or head state accordingly. When the internal state is next examined—
this occurs in the next call to ReadyEffector (that is, after the next time the joint positions are
set by OVirtualRobotComm)—the movement module begins executing the requested move-
ment. See Table 5.3 for a summary of the possible requests to the movement module. Note that
both a head movement and a body movement may be requested simultaneously, with the same
message. However, if the body movement that is requested needs control of the head joints,
the head request is ignored.
TABLE 5.3: Requests to the movement module

TYPE OF REQUEST     EXPLANATION                            ASSOCIATED PARAMETERS
MOVE NOOP           don't change body movement
MOVE STOP           stop leg movement
MOVE PARAM WALK     start walking using ParamWalk          x-velocity, y-velocity, angular velocity
MOVE SPLINE WALK    start walking using SplineWalk         x-destination, y-destination, angular destination
MOVE KICK           execute a kick                         type of kick
MOVE GETUP          get up from a fall
DONE GETUP          robot is now upright, resume motions
HEAD NOOP           don't change head movement
HEAD MOVE           move head to a specific angle
HEAD SCAN           scan head at constant velocity         scan speed, direction
HEAD KICK           kick with the head                     type of kick
HEAD STOP           stop head movement
The movement interface provides functions for requesting each type of walk and kick, and for getting up from a fall. It also provides several functions for more complex movements, which are described here.
Head Scan
When searching for the ball, it is helpful to move the head around in some fashion so that more
of the field can be seen. On one hand, the more quickly the field can be covered by the scan, the
more quickly the ball can be found. On the other hand, if the head moves too quickly, the vision
will not be able to recognize the ball, because it will not be in sight for the required number of
frames. Therefore, it makes sense to try to cover as much of the field with as little head movement as possible. At first, we believed that it was not possible to cover the entire height of the field with a single layer of the scan, which led to the two-layer head scan we use.
Follow Object
Once the robot sees the ball, walking towards it is achieved by two simultaneous control laws.
The first keeps the head pointed directly at the ball as the ball moves in the image. This is
achieved by taking the horizontal and vertical distances between the location of the ball in the
image and the center of the image and converting them into changes in the head pan and tilt
angles.
Second, the AIBO walks towards the direction that its head is pointing. It does this by walking with a combination of forward and turning velocities. As the head's pan angle changes from the straight-ahead position towards a sideways-facing position, the forward velocity decreases linearly (from its maximum) and the turning velocity increases linearly (from zero). In combination, these policies bring the AIBO towards the ball.
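A sketch of this second control law follows; the maximum forward speed, the turn gain, and the pan limit are illustrative values rather than our tuned constants.

    #include <algorithm>
    #include <cmath>

    struct WalkCommand { double forward, turn; };

    // Forward speed falls off linearly with |pan|; turn speed grows linearly.
    WalkCommand followHead(double panRad) {
        const double kMaxForward = 250.0;  // mm/s at pan = 0, assumed
        const double kMaxPan     = 1.57;   // rad, roughly sideways-facing
        const double kTurnGain   = 40.0;   // deg/s per rad of pan, assumed
        double a = std::min(std::fabs(panRad), kMaxPan) / kMaxPan;
        WalkCommand c;
        c.forward = kMaxForward * (1.0 - a);  // maximal when looking straight ahead
        c.turn    = kTurnGain * panRad;       // sign follows the pan direction
        return c;
    }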
While we were able to use the above methods to have the AIBO walk in the general
direction of the ball, it proved quite difficult to have the AIBO reliably attain control of the
ball. One problem was that the robot would knock the ball away with its legs as it approached
the ball. We found that if we increased the proportionality constant of the turning velocity, it
would allow the robot to face the ball more precisely as it went up to the ball. Then the ball
would end up between the AIBO’s front legs instead of getting knocked away by one of them.
Another problem that arose was that the AIBO occasionally bumped the ball out of the way with its head. We dealt with this by having the robot keep its head pointed 10° above the ball. Both of these solutions required some experimentation and tuning of parameters.
Track Object
This function follows a ball with the head, and turns the body in place when necessary, so as
not to lose sight of the ball. It is used chiefly for the goalie.
Strafe
Before we had localization in place, we needed a way to turn the robot around the ball so that
it could kick it towards the goal. The problem was that we needed to keep its head pointing down the field so it could see the goal, which made turning with the ball pinched underneath the chin (see below) infeasible. Strafing consisted of walking with a sideways velocity and a
turning velocity, but no forward velocity. This caused the AIBO to walk sideways in a circle
around the ball. During this time, it was able to keep its head pointed straight ahead so that it
could stop when it saw the goal.
2 The choice of heading to the offensive goal as the landmark for determining when the chin pinch turn should stop is due to the fact that the chin pinch turn's destination is often facing the opponent goal, as well as the fact that there was already a convenient GlobalMap interface function that provided heading to the offensive goal. In theory, anything else would work equally well.
3 That is, the behavior repeatedly sends requests to the movement interface to execute the chin pinch turn.
FIGURE 5.6: The training environment for the gait learning experiments. Each AIBO times itself as
it moves back and forth between a pair of landmarks (A and A’, B and B’, or C and C’).
The AIBOs walked back and forth between pairs of fixed landmarks, evaluating each set of gait parameters that they received from a central computer by timing themselves as they walked.
the gait policy space, they discovered increasingly fast gaits. This approach of automated gait-
tuning circumvented the need for us to tune the gait parameters by hand, and proved to be very
effective in generating fast gaits. Full details about this approach are given in [44] and video is
available on-line at www.cs.utexas.edu/~AustinVilla/?p=research/learned_walk.
Since our original report on this method [44], there has been a spate of research on
efficient learning algorithms for quadrupedal locomotion [10, 11, 13, 16, 40, 59–63]. A key
feature of our approach and most of those that have followed is that the robots time themselves
walking across a known, fixed distance, thus eliminating the need for any human supervision.
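A toy version of the evaluate-by-timing loop appears below, with a simple hill-climbing learner standing in for the actual learning algorithm of [44]; timedWalk is a placeholder for the robot walking between two landmarks and reporting its speed.

    #include <cstdlib>
    #include <vector>

    // Placeholder: the robot walks the fixed distance with these parameters
    // and reports its measured speed in mm/s.
    double timedWalk(const std::vector<double>& params);

    // Toy hill climber: perturb, evaluate, keep the fastest candidate.
    std::vector<double> tuneGait(std::vector<double> best, int iterations) {
        double bestSpeed = timedWalk(best);
        for (int i = 0; i < iterations; ++i) {
            std::vector<double> cand = best;
            for (double& p : cand)   // perturb each parameter by up to +/-2%
                p += 0.02 * p * (2.0 * std::rand() / RAND_MAX - 1.0);
            double s = timedWalk(cand);
            if (s > bestSpeed) { bestSpeed = s; best = cand; }
        }
        return best;
    }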
CHAPTER 6
Fall Detection
Sony provides routines that enable the robot to detect when it has fallen and that enable it to
get up. Our initial approach was to simply use these routines. However, as our walk evolved,
the angle of the AIBO’s trunk while walking became steeper. This, combined with variations
between robots, caused several of our robots to think they were falling over every few steps and
to try repeatedly to get up. To remedy this, we implemented a simple fall-detection system of
our own.
The fall-detection system functions by noting the robot’s x- and y-accelerometer sensor
values each Brain cycle. If the absolute value of an accelerometer reading indicates that the robot
is not in an upright position for a number (we use 5) of consecutive cycles, a fall is registered.
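A sketch of this check appears below; the accelerometer threshold is an assumed value, while the five-cycle count matches the text.

    #include <cmath>

    // Per-Brain-cycle fall check from the x- and y-accelerometers.
    class FallDetector {
    public:
        bool update(double accelX, double accelY) {
            const double kUprightThreshold = 5.0;  // assumed reading cutoff
            const int    kCyclesToRegister = 5;    // consecutive cycles (we use 5)
            bool tilted = std::fabs(accelX) > kUprightThreshold ||
                          std::fabs(accelY) > kUprightThreshold;
            count = tilted ? count + 1 : 0;
            return enabled && count >= kCyclesToRegister;  // fall registered
        }
        void setEnabled(bool e) { enabled = e; }  // turned off while kicking
    private:
        int  count = 0;
        bool enabled = true;
    };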
It is also possible to turn fall-detection off for some period of time. Many of our kicks
require the AIBO to pass through a state which would normally register as a fall, so fall-detection
is disabled while the AIBO is kicking. If the AIBO falls during a kick, the fall-detection system
registers the fall when the kick is finished, and the AIBO then gets up.
CHAPTER 7
Kicking
Vision and movement are generally applicable robot capabilities: any vision-based robot will need some of the capabilities described in Chapter 4, and any legged robot will need some of those described in Chapter 5. Because of the particular challenge task we adopted (robot soccer), creating a ball-kicking motion was also important. Though kicking itself may not be of general interest, it represents tasks that require precise fine motor control. In this chapter, we give a brief overview of our kick-generation process. Details of most of the actual kicks are presented in Appendix B.
The robot's kick is specified by a sequence of poses. A Pose $= (j_1, \ldots, j_n)$, $j_i \in \mathbb{R}$, where $j_i$ represents the position of the $i$th of the robot's $n$ joints. The robot uses its PID mechanism to move joints 1 through $n$ from one Pose to another over a time interval $t$. We specify each kick as a series of moves $\{Move_1, \ldots, Move_m\}$, where a $Move = (Pose_i, Pose_f, t)$ and $Move_j.Pose_f = Move_{j+1}.Pose_i$, $\forall j \in [1, m-1]$. All of our kicks used only 16 of the robot's joints (leg, head, and mouth). Table 7.1 depicts the used joints and joint descriptions.
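The formalism translates directly into data structures; a sketch follows, with a consistency check corresponding to the constraint that consecutive Moves share a Pose.

    #include <cstddef>
    #include <vector>

    const int NUM_JOINTS = 16;                  // the joints of Table 7.1

    struct Pose { double joint[NUM_JOINTS]; };  // one target joint configuration

    struct Move {
        Pose initial, final_;                   // Pose_i and Pose_f
        int  t;                                 // duration in 8 ms movement cycles
    };

    // A kick is well-formed when each Move starts where the last one ended.
    bool isValidKick(const std::vector<Move>& kick) {
        for (std::size_t j = 0; j + 1 < kick.size(); ++j)
            for (int k = 0; k < NUM_JOINTS; ++k)
                if (kick[j].final_.joint[k] != kick[j + 1].initial.joint[k])
                    return false;
        return true;
    }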
In the beginning stages of our team development, our main focus was on creating modules
(Movement, Vision, Localization, etc.) and incorporating them with one another. Development
of kicks did not become a high priority until after the other modules had been incorporated.
However, once it became a focus issue, and after generating a single initial kick [78], we soon
realized that we would need to create several different kicks for different purposes. To that end,
we started thinking of the kick-generation process in general terms. This section formalizes
that process.
The kick is an example of a fine-motor control motion where small errors matter. Creation
of a kick requires special attention to each Pose. A few angles’ difference could affect whether
the robot makes contact with the ball. Even a small difference in t in a Move could affect the
success of a kick. To make matters more complicated, our team needed the kick to transition
from and to a walk. More consideration had to be taken to ensure that neither the walk nor the
kick disrupted the operation of the other.
We devised a two-step technique for kick generation: first, design the critical action as a sequence of Moves that brings the robot into contact with the ball; second, tune the time interval $t$ of each Move. To design the critical action, we began with a Move from a $Pose_a$ to a $Pose_b$ and watched the path the robot took from $Pose_a$ to $Pose_b$. Thus, we abstracted away the $t$ of the Move by selecting a large $t$ that enabled us to watch the path from $Pose_a$ to $Pose_b$. We typically selected $t$ to be 64. Since movement module requests are sent every 8 ms, this Move took 64 × 8 = 512 ms to execute.
If the Move did not travel a path that allowed the robot to kick the ball successfully, we then added an intermediary $Pose_x$ between $Pose_a$ and $Pose_b$, creating a sequence of two Moves $\{(Pose_a, Pose_x, t_i), (Pose_x, Pose_b, t_{i+1})\}$, and watched the execution. Again, we abstracted away $t_i$ and $t_{i+1}$, typically selecting 64. After watching the path for this sequence of Moves, we repeated the process if necessary.

After we were finally satisfied with the sequence of Moves in the critical action, we tuned the $t$ for each Move. Our goal was to execute each Move of the critical action as quickly as possible. Thus, we reduced $t$ for each Move individually, stopping if the next decrement disrupted the kick.
CHAPTER 8
Localization
One of the most fundamental tasks for a mobile robot is the ability to determine its location
in its environment from sensory input. A significant amount of work has been done on this
so-called localization problem. Since it requires at least vision and preferably locomotion to
already be in place, localization was a relatively late emphasis in our efforts. In fact, it did not
truly come into place until after the American Open Competition at the end of April 2003
(four months into our development effort).1
One common approach to the localization problem is particle filtering or Monte Carlo
Localization (MCL) [86, 87]. This approach uses a collection of particles to estimate the global
position and orientation of the robot. These estimates are updated by visual percepts of fixed
landmarks and odometry data from the robot’s movement module (see Section 5.1.8). The
particles are averaged to find a best guess of the robot’s pose. MCL has been shown to be a
robust solution for mobile robot localization, particularly in the face of collisions and large,
unexpected movements (e.g. the “kidnapped robot” problem [27]). Although this method has
a well-grounded theoretical foundation, and has been demonstrated to be effective in a number
of real-world settings, there remain some practical challenges to deploying it in a new robotic
setting.
Most previous MCL implementations have been on wheeled robots with sonar or laser
sensors (e.g. [26, 27]). In comparison with our setting, these previous settings have the advantages of relatively accurate odometry models and 180°, or sometimes up to 360°, sensory
information. Although MCL has been applied to legged, vision-based robots by past RoboCup
teams [45, 46, 64], our work contributes novel enhancements that make its implementation
more practical.
In our first year of development, we began with a baseline implementation of Monte Carlo
Localization adapted from recent literature [64] that achieves a reasonable level of competence.
In the following year, we then developed a series of innovations and adjustments required to improve the robot's performance with respect to three desiderata: accuracy of the pose estimate, stability and speed when navigating to target points, and the ability to recover from collisions and kidnappings.
1 This chapter is adapted from [70].
All of these properties must be achieved within the limits of the robot's on-board processing capabilities.

In order to achieve these desiderata, we enhance our baseline implementation with the following three additions, each described in Section 8.2: (i) maintaining a history of landmark observations across frames and merging them; (ii) incorporating distance estimates to landmarks, corrected for bias by a learned function approximator, into the probability updates and reseeding; and (iii) an extended, behavior-based motion model.
8.1 BACKGROUND
In Monte Carlo Localization, a robot estimates its position using a set of samples called particles. Each particle, $\langle x, y, \theta, p \rangle$, represents a hypothesis about the robot's pose: its global location $(x, y)$ and orientation $\theta$. The probability, $p$, expresses the robot's confidence in this hypothesis. The density of particle probabilities represents a probability distribution over the space of possible poses.
Each operating cycle, the robot updates its pose estimate by incorporating information
from its action commands and sensors. Two different probabilistic models must be supplied to
perform these updates. The Motion Model attempts to capture the robot’s kinematics. Given
the robot’s previous pose, and an action command, such as “walk forward at 300 mm/s” or
“turn at 0.6 rad/s,” the model predicts the robot’s new pose. Formally, it defines the probability
distribution $p(h' \mid h, a)$, where $h$ is the old pose estimate, $a$ is the last action command executed, and $h'$ is the new pose estimate.
The Sensor Model is a model of the robot's perceptors and environment. It predicts what observations will be made by the robot's sensors, given its current pose. The probability distribution that it defines is $p(o \mid h)$, where $o$ is an observation such as "landmark X is 1200 mm away and 20 degrees to my right," and $h$ is again the pose estimate.
Given these two models, we seek to compute $p(h_T \mid o_T, a_{T-1}, o_{T-1}, a_{T-2}, \ldots, a_0)$, where $T$ is the current cycle and $h_t$, $o_t$, and $a_t$ are the pose estimate, observation, and action command, respectively, for time $t$. Each cycle, a new pose is first sampled for each particle $i$ from the motion model:

$$h_T^{(i)} \sim p\big(h_T \mid h_{T-1}^{(i)}, a_{T-1}\big). \qquad (8.1)$$
Next, the probability of each particle is updated by the sensor model, based on the current sensory data. The sensor model computes the likelihood of the robot's observations given the particle's pose, and adjusts the particle's probability accordingly. To prevent occasional bad sensory data from having too drastic an effect, a particle's change in probability is typically limited by some filtering function, $F(p_{old}, p_{desired})$. The sensor model update is given by the following equation:

$$p_T^{(i)} := F\big(p_{T-1}^{(i)},\; p(o_T \mid h_T^{(i)})\big). \qquad (8.2)$$

Finally, the particles are resampled in proportion to their probabilities; the expected number of copies of particle $i$ among the $m$ new particles is

$$m \cdot \frac{p^{(i)}}{\sum_{j=1}^{m} p^{(j)}}. \qquad (8.3)$$
This description of the basic MCL algorithm specifies how we maintain a probabilistic
model of the robot’s location over time, but it omits several details. For instance, how do we
obtain the motion and sensor models? And how many particles do we need? Some previous
answers to these questions are surveyed in the following section.
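For concreteness, one cycle of the algorithm might be organized as follows. sampleMotion, observationLikelihood, and filterProb stand in for the motion model, the sensor model, and the filtering function $F$; the roulette-wheel loop is one simple realization of the resampling in Eq. (8.3).

    #include <cstddef>
    #include <random>
    #include <vector>

    struct Particle { double x, y, theta, p; };

    // Placeholders for the motion model, sensor model, and filter F.
    Particle sampleMotion(const Particle& h, const double action[3]);
    double observationLikelihood(const Particle& h);   // p(o | h)
    double filterProb(double pOld, double pDesired);   // F in Eq. (8.2)

    void mclUpdate(std::vector<Particle>& ps, const double action[3]) {
        for (Particle& h : ps) {
            Particle moved = sampleMotion(h, action);                  // Eq. (8.1)
            moved.p = filterProb(h.p, observationLikelihood(moved));   // Eq. (8.2)
            h = moved;
        }
        double sum = 0.0;                        // normalizer for Eq. (8.3)
        for (const Particle& h : ps) sum += h.p;
        std::mt19937 rng(0);
        std::uniform_real_distribution<double> pick(0.0, sum);
        std::vector<Particle> next;
        for (std::size_t i = 0; i < ps.size(); ++i) {  // roulette-wheel resampling
            double r = pick(rng), acc = 0.0;
            for (const Particle& h : ps) {
                acc += h.p;
                if (acc >= r) { next.push_back(h); break; }
            }
        }
        ps.swap(next);
    }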
For each landmark $l$ seen in the current frame, the similarity $s$ between the measured bearing $\alpha_{meas}^{(l)}$ and the bearing expected from the particle's pose, $\alpha_{exp}^{(l)}$, is computed as

$$s\big(\alpha_{meas}^{(l)}, \alpha_{exp}^{(l)}\big) = \begin{cases} e^{-50\,\omega_l^2}, & \text{if } \omega_l < 1 \\ e^{-50\,(2-\omega_l)^2}, & \text{otherwise,} \end{cases} \qquad (8.4)$$

where $\omega_l = \dfrac{\big|\alpha_{meas}^{(l)} - \alpha_{exp}^{(l)}\big|}{\pi}$. The probability, $p$, of a particle is then the product of these similarities over the set $L$ of landmarks seen in the frame:

$$p = \prod_{l \in L} s\big(\alpha_{meas}^{(l)}, \alpha_{exp}^{(l)}\big). \qquad (8.5)$$
Lastly, the following filtering function is applied to update the particle's probability [64]:

$$p_{new} = \begin{cases} p_{old} + 0.1, & \text{if } p > p_{old} + 0.1 \\ p_{old} - 0.05, & \text{if } p < p_{old} - 0.05 \\ p, & \text{otherwise.} \end{cases} \qquad (8.6)$$
An important aspect of this sensor model is that distances to landmarks are completely
ignored. The motivation for this restriction is that vision-based distance estimates are typically
quite noisy and the distribution is not easy to model analytically. Worse yet, there can be a
strong nonlinear bias in the distance estimation process which makes the inclusion of distance
estimates actively harmful, causing localization accuracy to degrade (see Section 8.3). In this
section, it is shown that the bias in distance estimates can be empirically modeled such that
they can be incorporated to improve sensor model updates.
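Equations (8.4) through (8.6) are straightforward to implement; in the sketch below, bearings are assumed to be normalized to $[-\pi, \pi]$ before the comparison.

    #include <cmath>

    const double kPi = 3.14159265358979323846;

    // Eq. (8.4): similarity between measured and expected landmark bearings.
    double similarity(double measured, double expected) {
        double w = std::fabs(measured - expected) / kPi;   // omega_l in [0, 2]
        return (w < 1.0) ? std::exp(-50.0 * w * w)
                         : std::exp(-50.0 * (2.0 - w) * (2.0 - w));
    }

    // Eq. (8.5): the desired probability is the product over visible landmarks.
    double particleProb(const double* meas, const double* expd, int n) {
        double p = 1.0;
        for (int l = 0; l < n; ++l) p *= similarity(meas[l], expd[l]);
        return p;
    }

    // Eq. (8.6): filtered update of a particle's probability toward p.
    double filterProb(double pOld, double p) {
        if (p > pOld + 0.1)  return pOld + 0.1;
        if (p < pOld - 0.05) return pOld - 0.05;
        return p;
    }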
In the baseline approach, to address the frequently occurring kidnapped robot problem,2 a few of the particles with low probability are replaced by estimates obtained by triangulation from the landmarks seen in the current frame. This process, called reseeding, is based upon the idea of Sensor Resetting Localization [46].
A shortcoming of previous reseeding approaches is that they require at least two or three
landmarks to be seen in the same camera frame to enable triangulation. This section presents a
concrete mechanism that enables us to use reseeding even when two landmarks are never seen
concurrently.
Another significant challenge to achieving MCL on legged robots is that of obtaining a
proper motion model. In our initial implementation, we used a motion model that provided
reasonably accurate velocity estimates when the robot was walking at near-maximum speed.
However, when the robot was close to its desired location, moving at full speed caused jerky
motion. The resulting “noise” in the motion model caused erroneous action model updates
(Eq. (8.1)). The robot assumed that it was moving at full speed, but before its motion was
completed, it received a pose estimate beyond the target. This estimate generated another
motion command, leading to oscillation around the target position. This section examines how improving the motion model, with an extension to the robot's behavior, affects this oscillation.3
2 In the RoboCup legged league, when a robot commits a foul, it is picked up by the referee and replaced at a different point on the field.
3 Note that the noisy motion model is not a property of MCL or any other algorithm, but rather of our own baseline implementation.
The stored observations are time-stamped: $t_c$ and $t_{lu}$ denote the current time and the time of the last update, and each observation's probability is decayed as a function of the elapsed time, so that older observations count for less. Next, to merge the observations corresponding to any one landmark $i$, we obtain the distance, heading, and probability of the aggregate landmark observation as

$$psum_i = \sum_j p_{i,j}, \qquad p_i = \frac{psum_i}{M_i},$$

$$d_i = \frac{\sum_j p_{i,j}\, d_{i,j}}{psum_i}, \qquad o_i = \sum_j p_{i,j}\, o_{i,j}, \qquad (8.9)$$

where $M_i$ is the number of stored observations of landmark $i$, and $p_{i,j}$, $d_{i,j}$, and $o_{i,j}$ are the probability, distance, and heading of the $j$th stored observation.
Because the motion model is not very accurate and the robot can be picked up and moved
to a different location frequently, the history estimates are deleted if they are older than a
threshold in time or if the robot has undergone significant movement (linear and/or rotational).
That is, we remove observations from the history if any one of the following three conditions is true: the observation is older than a time threshold $t_{th}$; the robot has translated more than a distance threshold $d_{th}$ since the observation was stored; or the robot has rotated more than an angular threshold $o_{th}$. In our current implementation, we use $t_{th} = 3$ s, $d_{th} = 15.0$ cm, and $o_{th} = 10.0°$. These thresholds were found through limited experimentation, though the algorithm is not particularly sensitive to their exact values.
8.2.2 Distance-Based Updates
To use distances to landmarks effectively in localization, we must first account for the nonlinear
bias in their estimation. Initially, the estimation was performed as follows:
• The landmarks in the visual frame are used to arrive at displacements (in pixels) with respect to the image center.
• Using similar triangles, along with knowledge of the camera parameters and the actual height of the landmark, these displacements are transformed into distance and angle measurements relative to the robot.
• Finally, we transform these measurements, using the measured robot and camera (tilt, pan, roll) parameters, to a frame of reference centered on the robot.
Using this analytic approach, we found that the distances were consistently underesti-
mated. The bias was not constant with distance to the landmarks, and as the distance increased,
the error increased to as much as 20%. This error actually made distance estimates harmful to
localization.
To overcome this problem, we introduced an intermediate correction function. We collected data corresponding to the measured (by the robot) and actual (using a tape measure) distances to landmarks at different positions on the field. Using polynomial regression, we estimated the coefficients of a cubic function that, when given a measured estimate, provided a corresponding corrected estimate. That is, given measured values $X$ and actual values $Y$, we estimated the coefficients $a_i$ of a polynomial of the form

$$y = a_3 x^3 + a_2 x^2 + a_1 x + a_0.$$
During normal operation, this polynomial was used to compute the corrected distance to
landmarks. Once this correction was applied, the distance estimates proved to be much more
reliable, with a maximum error of 5%. This increased accuracy allowed us to include distance
estimates for both probability updates and reseeding.
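Applying the learned correction at run time is then a single polynomial evaluation; the coefficients below are placeholders for the regressed values.

    // Run-time correction of a measured landmark distance with the learned
    // cubic; the coefficients shown are illustrative, not our fitted values.
    double correctDistance(double measured) {
        const double a0 = 0.0, a1 = 1.1, a2 = 1.0e-5, a3 = 1.0e-9;
        return a0 + measured * (a1 + measured * (a2 + measured * a3));
    }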
8.3.1 Simulator
Debugging code and tuning parameters are often cumbersome tasks to perform on a physical
robot. Particle filtering implementations require many parameters to be tuned. In addition,
the robustness of the algorithm often masks bugs, making them difficult to track down. To
assist us in the development of our localization code, we constructed a simulator. Conveniently,
the simulator has also proven to be useful for running experiments like the ones presented
below.
The simulator does not attempt to simulate the camera input and body physics of the
actual Sony AIBO. Instead, it interacts directly with the localization level of abstraction.
Observations are presented as distances and angles to landmarks relative to the robot. In return,
the simulator expects high-level action commands to move the robot’s head and body. The
movement and sensor models both add Gaussian noise to reflect real-world conditions. Details
can be found in Chapter 14.
Our experiments evaluate localization performance along four dimensions:
• overall accuracy;
• time taken to navigate to a series of points;
• ability to stabilize at a target point; and
• ability to recover from collisions and "kidnappings."
FIGURE 8.1: Points, numbered in sequence, that the robot walks during experimentation; arrows
indicate the robot’s target heading.
We found it necessary to use the simulator for the recovery experiments because it allowed us to consistently reproduce collisions and kidnappings while accurately tracking the robot's true pose over time.
The experimental conditions are None (the baseline), HST, DST, FA, DST + FA, and All. These conditions were chosen to test each enhancement in isolation and in combination. Further, they allow us to demonstrate that using landmark distances can be harmful if the distance estimates are not properly corrected. In all six conditions, we used the behavior-based motion model enhancement (Section 8.2.3).4 The impact of the extended motion model was tested separately on a task requiring especially accurate positioning (see Section 8.3.5).

4 In general, the robot is not able to scan the field constantly; we allow it to do so in these experiments for the purpose of consistency.
The results of these experiments, averaged across ten runs each, are shown in Tables 8.1
and 8.2. The localization errors were computed as the distance between the robot’s center
and the target location when the robot indicated that it believed it had reached the target.
Significance is established using a Student’s t-test. The p-values measure the likelihood that
each entry differs from the baseline algorithm (labeled None). We follow the convention of
using a p-value of < 0.05 (>95% confidence) to establish statistical significance.
From Table 8.1, it is evident that the error in pose is considerably reduced after
the addition of all the enhancements. There is a 50% reduction in position error with re-
spect to the baseline implementation. The improvement in orientation accuracy is even more
dramatic.
Though the addition of observation histories alone (HST) does not significantly improve accuracy, when combined with the other enhancements there is a significant improvement (p-value between DST + FA and All $\leq 10^{-15}$).

The general reluctance to utilize distance-based probability updates in localization is explained by the fact that when they are used without accounting for the nonlinear bias through function approximation, the error is greater than even that of the baseline implementation. By itself, FA produces a good reduction in error because the improved distance estimates lead to better reseed values, even if they are not used for probability updates. Using both, i.e. DST + FA, results in the biggest improvement after All (p-value $= 10^{-15}$ w.r.t. DST).
From Table 8.2, we see that the addition of all the enhancements does not significantly
increase the time taken to reach the target locations. However, using DST without incorporating
FA or HST does lead to worse time performance because the robot has trouble settling at a
point. Since the oscillation happens even with the extended motion model in place, it can be
attributed solely to bad probability updates. Though the average time taken is the lowest for
DST + FA, this was not significant (p-value between DST + FA and All is 0.51).
With corrected distance estimates, the robot also gets better reseed estimates. This confirms our hypothesis that landmark distances can be a useful enhancement.
TABLE 8.4: Errors With and Without the Extended Motion Model
8.3.6 Recovery
Finally, we performed a set of experiments using our simulator to test the effect of our enhance-
ments on the robot’s ability to recover from unmodeled movements. We tested our localization
algorithm against two types of interference that are commonly encountered in RoboCup soccer:
collisions and kidnappings. In both cases, we disrupt the robot once every 30 s (simulated) while
it attempts to walk a figure-8 path around the field and scan with its head. Collisions are simu-
lated by preventing the robot’s body from moving (unbeknownst to it so it continues to perform
motion updates) for 5 s. The robot’s head is permitted to continue moving. In the kidnapping
experiments, the robot is instantly transported to a random spot on the field, 1200 mm from
its previous location and given a random orientation. We test our enhancements by comparing
these scenarios to the case in which the robot simply walks a figure-8 pattern undisturbed.
Twice per second, we record the absolute error in the robot's pose estimate. We compare average
errors for 2 h of simulated time, corresponding to roughly 50 laps around the field. The results
are summarized in Tables 8.5 and 8.6.
From the distance error table, we can see that for every test condition, the collision and
kidnapping disturbances caused an increase in average error when compared to the undisturbed
scenario, as we would expect. However, the increase in error is much smaller when our en-
hancements were used. For instance, with all enhancements turned off, the kidnapped robot
problem causes almost a 10-fold increase in error. But when we include our enhancements, the
error is only increased by 56%. There is a corresponding improvement in angle accuracy.
When we look at the individual contributions of the enhancements, we see that for both
disturbance scenarios, the individual enhancements are better than no enhancements. Also,
in both cases, they have a larger effect when combined than when they are used individually.
This implies that both enhancements contribute to the improved recovery. Only in the case
of DST + FA, when comparing angle accuracy to the baseline for collisions, do we not see a
significant improvement. This is most likely explained by the limited effect of distance updates
on orientation accuracy in general.
Finally, looking at the “Undisturbed” column, we notice that the enhancements, if any-
thing, have a somewhat negative effect in the absence of interference. It would seem that this
finding is incompatible with the accuracy results reported in Table 8.1. However, the experi-
mental conditions in the two scenarios were significantly different. In particular, in the recovery
experiments, we did not wait for the robot’s estimate to stabilize before measuring the error.
This metric not only measures how well the robot localizes when it has good information, but
also how poorly it does when it is lost, which is why this metric is appropriate for testing the
recovery.
Although the distance error for HST is quite high, in the All case we do only slightly
worse than originally. Thus, although the algorithm has been tuned to work well in game
situations where collisions and kidnappings are common, we don’t give up much accuracy when
those disturbances do not occur.
This chapter has described some potential pitfalls, and their solutions, when implementing particle filtering on legged robots with a limited range of vision.
Though this section focuses on MCL, or particle filtering, approaches to localization, which have been popular in recent years, other effective approaches also exist and have been used on AIBOs and on other robots as well. One such example is the Extended Kalman Filter (EKF), a modification of the classic Kalman Filter [38] to handle nonlinearities in the system.
The EKF has been extensively used for state estimation. On robots, it has been used for pose
estimation [31, 35, 47, 48] using both range data (from laser or sonar) and visual data (from a
camera). It has been used for localization in a distributed setting [49], and also for the problem
of simultaneous localization and mapping [37].
On the AIBOs in particular, it has been used successfully by several teams [17, 18, 66], including the NuBots, who won the RoboCup four-legged competition in 2006. Other teams
have used a combination of particle filters and EKF for localization [7]. Gutmann and Fox [30]
provide a good comparison of the various localization algorithms tested on data obtained from
the robot soccer setting.
CHAPTER 9
Communication
Collective decision making is an essential aspect of a multiagent domain such as robot soccer.
The robots thus need the ability to share information among themselves. This chapter discusses
the methodologies we adopted to enable communication and the various stages of development
of the resulting communication module.
If a robot crashes or is removed during a game, we want our robots to notice this and be able to incorporate this knowledge into their behavior.
Initially, we had a system that allowed us to tell if a robot was "hearing" from any other robot: a blue LED on the top of the head would blink every time a message was received. This was not sufficient; a robot could be hearing from only one other robot, but missing out on the important information from the other two robots. In reality, the system is much more complex than this. With two connections between each pair of robots (each direction is a separate connection), there are a total of 12 connections to keep track of. Eventually, we decided to light up four LEDs on the face (the new ERS-7 robots made this possible), one for each robot. If the robot believes another robot to be "dead," that is, it hasn't heard from it in a while, then that LED is not illuminated. This provides us with a very quick and easy way to establish both whether communication is present (if the lights are illuminated) and what the quality of the communication is (if the lights are blinking in and out).
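A sketch of the bookkeeping behind these LEDs appears below; the timeout value and the LED call are assumptions standing in for the real hardware interface.

    const int    kTeammates   = 4;
    const double kDeadSeconds = 3.0;   // assumed timeout before a robot is "dead"

    struct Liveness {
        double lastHeard[kTeammates] = {0.0, 0.0, 0.0, 0.0};

        void onMessage(int robot, double now) { lastHeard[robot] = now; }

        // Run periodically: one face LED per teammate, lit while alive.
        void updateLEDs(double now) {
            for (int r = 0; r < kTeammates; ++r)
                setFaceLED(r, now - lastHeard[r] < kDeadSeconds);
        }
        void setFaceLED(int which, bool on) { /* hardware-specific call */ }
    };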
C H A P T E R 10
General Architecture
Due to our bottom-up approach, we did not address the issue of general architecture until some
important modules had already taken shape. We had some code that cobbled together our vision
and movement components to produce a rudimentary but functional goal-scoring behavior (see
Section 12.1.1). Although this agent worked, we realized that we would need a more structured
architecture to develop a more sophisticated agent, particularly with the number of programmers
working concurrently on the project. The decision to adopt the architecture described below
did not come easily, since we already had something that worked. Implementing a cleaner approach stopped our momentum in the short term and required some team members to rewrite their code, but we feel the effort proved worthwhile as we continued to combine more independently developed modules.
We designed a framework for the modules with the aim of facilitating further develop-
ment. We considered taking advantage of the operating system’s inherent distributed nature
and giving each module its own process. However, we decided that the task did not require
such a high degree of concurrent computation, so we organized our code into just two separate
concurrent objects (Fig. 10.1).
We encapsulated all of the code implementing low-level movement (Section 5.2.1)
in the MovementModule object. This module receives Open-R messages dictating which
movement to execute. Available leg movements include locomotion in a particular direction,
speed, and turning rate; any one of a repertoire of kicks; and getting up from a fallen position.
Additionally, the messages may contain independent directives for the head, mouth, and tail.
The MovementModule translates these commands into sequences of set points, which it feeds
as messages into the robot’s OVirtualRobotComm object. Note that this code inhabits its
own Open-R object precisely so that it can supply a steady stream of commands to the robot
asynchronously with respect to sensor processing and deliberation. For further details on the
movement module, see Section 5.2.1.
The Brain object is responsible for the remainder of the agent’s tasks: accept-
ing messages containing camera images from OVirtualRobotComm, communicating over
the wireless network, and deciding what movement command messages to send to the
MovementModule object. It contains the remaining modules, including Vision, Fall Detection,
FIGURE 10.1: A high-level view of the main Open-R objects in our agent. The robot sends visual
data to the Brain object, which sends movement commands to the MovementModule object, which
sends set points to the PID controllers in the robot. The Brain object also has network connections to
teammates’ Brain objects, the RoboCup game controller, and our UT Assist client (Chapter 15). Note
that this figure omits sensor readings obtained via direct Open-R API calls.
Localization, and Communication. These components thus exist as C++ objects within a single
Open-R object. The Brain itself does not provide much organization for the modules that com-
prise it. In large part, it serves as a container for the modules, which are free to call each other’s
methods.
From an implementation perspective, the Brain’s primary job is to activate the appropriate
modules at the appropriate times. Our agent’s “main loop” activates whenever the Brain receives
a new visual image from OVirtualRobotComm. Other types of incoming data, mostly from
the wireless network, reside in buffers until the camera instigates the next Brain cycle. Each
camera image triggers the following sequence of actions from the Brain:
Get Data: The Brain first obtains the current joint positions and other sensor readings from
Open-R. It stores these data in a place where modules such as Fall Detection can read them.
This means that we ignore the joint positions and sensor readings that OVirtualRobotComm
generates between vision frames.
Process Data: Now the Brain invokes all those modules concerning interpreting sensor input:
Vision, Localization, and Fall Detection. Note that for simplicity’s sake even Communication
data wait until this step, synchronized by inputs from the camera, before being processed.
Generally, the end-result of this step is to update the agent’s internal representation of its
external environment: the global map (see Chapter 11).
Act: After the Brain has taken care of sensing, it invokes those modules that implement acting,
described in Chapters 12 and 13. These modules typically don’t directly access the data
gathered by the Brain. Instead they query the updated global map.
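In outline, the camera-triggered loop looks like the following; the class and method names are schematic stand-ins for our actual C++ objects.

    // Schematic of the Brain's main loop, triggered by each camera image.
    struct Brain {
        void onNewImage() {
            getData();       // latch joint positions and sensor readings
            processData();   // Vision, Localization, Fall Detection -> global map
            act();           // behaviors query the global map, send movement requests
        }
        void getData()     { /* read sensors via direct Open-R API calls */ }
        void processData() { /* run the perception modules */ }
        void act()         { /* run the behavior module */ }
    };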
C H A P T E R 11
Global Map
Early in the development of our soccer-playing agent, particularly before we had function-
ing localization and communication, we chose our actions using a simple finite state machine
(see Chapter 12). Our sensory input and feedback from the Movement Module dictated
state transitions, so sensations had a relatively direct influence on the behavior. However,
once we developed the capability to locate our agents and the ball on the field and to com-
municate this information, such a direct mapping became impossible. We created the global
map to satisfy the need for an intermediate level of reasoning. The global map combines the
outputs of Localization from Vision and from Communication into a coherent picture of what
is happening in the game, and it provides methods that interpret this map in meaningful ways
to the code that governs behavior.
For example, if we consider the modeling of the opponents, we want our estimates of the
opponents to be as accurate as possible and we do not want new estimates to occur every frame.
When robot A receives teammate position information from robot B, robot A always
assumes that B’s estimate of B’s position is better than A’s estimate of B’s position. Therefore,
robot A simply replaces its old position for B with the new position.
When a robot receives opponent information from another robot, it updates its current
estimate of opponent locations as described in Section 4.6.
If robot A has seen the ball recently when it receives a ball position update from robot
B, robot A ignores B’s estimate of ball position. If robot A hasn’t seen the ball recently, then it
merges its current estimate of the ball’s position with the position estimate that it receives from
robot B.
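A sketch of the ball rule appears below; the recency window and the equal-weight merge are illustrative simplifications of our actual bookkeeping.

    struct BallEstimate { double x, y, lastSeen; };

    // Trust a recent own sighting; otherwise merge in a teammate's estimate.
    void updateBallFromTeammate(BallEstimate& mine, const BallEstimate& theirs,
                                double now) {
        const double kRecentSeconds = 2.0;        // assumed recency window
        if (now - mine.lastSeen < kRecentSeconds) return;  // ignore teammate
        mine.x = 0.5 * (mine.x + theirs.x);       // simple equal-weight merge
        mine.y = 0.5 * (mine.y + theirs.y);
    }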
The basic idea behind having a global map is to make sharing of information possible
so that the team consisting of individual agents with limited knowledge of their surroundings
can pool the information to function better as a team. The aim is to have completely shared
knowledge but the extent to which this succeeds is dependent upon the ability to communicate.
Since the communication (see Chapter 9) is not fully reliable, we have to design a good strategy
(Chapter 12 describes our strategy and behaviors) that uses the available information to the
maximum extent possible. Other modules can access the information in the GlobalMaps using
the accessor functions (predicates) described in the following section.
Examples of these accessor functions include GetDistanceFromAttacker, GetDistanceFromSupporter, HeadingToOppLeftCorner, and HeadingToOppRightCorner.
C H A P T E R 12
Behaviors
This chapter describes the robot’s soccer-playing behaviors. In our initial development, we had
relatively little time to focus on behaviors, spending much more of our time building up the low-
level modules such as walking, vision, and localization. As such, the behaviors described here
are far from ideal. We later overhauled this component of our code base when we participated
in subsequent competitions. Nonetheless, a detailed description of our initial year’s behaviors is
presented for the sake of completeness, and to illustrate what was possible in the time we had
to work.
Since opposing robots are more likely to be in the center of the field, a good strategy for accomplishing this is to kick toward the outside so that the ball will on average be allowed to travel farther before its path is obstructed.
When the robot is near the wall and facing it, the head kick is typically used. This is
chiefly because we want to use the chin pinch turn as little as possible when we are along the
wall. The more the robot runs into the wall while moving, the larger the discrepancy becomes
between the actual distance traveled and the information that odometry gives to localization.
Because the FSM uses localization to determine when to switch from chin pinch turning to
kicking, the longer the robot uses a chin pinch turn along a wall, the less likely it is to stop
turning at the right time. So, when very near a wall and facing it, it is typically better to use the head kick.
• Head Scan For Ball: This is the first of a few states designed to find the ball. While in this state, the robot stands in place scanning the field with its head, using a two-layer head scan.

• Turning For Ball: This state corresponds to turning in place with the head in a fixed position (pointing ahead but tilted down slightly).

• Walking To Unseen Ball: This state is for when the robot does not see the ball itself, but one of its teammates communicates the ball's location to it. The robot then tries to walk towards the ball while scanning with its head to find it.

• Walking To Seen Ball: Here we see the ball and are walking towards it. During this state, the robot keeps its head pointed towards the ball and walks in the direction that its head is pointing. As the robot approaches the ball, it captures the ball by lowering its head right before transitioning into the Chin Pinch Turn state.

• Chin Pinch Turn: This state pinches the ball between the robot's chin and the ground. The robot then turns with the ball to face the direction in which it is trying to kick.

• Kicking: While in this state, the robot is kicking the ball.

• Recover From Kick: Here the robot updates its knowledge of where the ball is and branches into another state. Both of these processes are influenced by which kick has just been performed.

• Stopped To See Ball: In this state, the robot is looking for the ball and has seen it, but does not yet have a high enough confidence that it is actually the ball (as opposed to a false positive from vision). To verify that the ball is there, the robot momentarily freezes in place. When the robot sees the ball for enough consecutive frames, it moves on to Walking To Seen Ball. If the robot fails to see the ball, it goes back to the state it was in last (where it was looking for the ball).
In order to navigate between these states, the FSM relies on a number of helper functions and variables that inform its state transition decisions. The most important of these are:

• BallLost: This Boolean variable denotes whether or not we are confident that we see the ball. It is a sticky version of whether high-level vision is reporting a ball: if BallLost is true, it becomes false only after the robot sees the ball (according to vision) for a number of consecutive frames. Similarly, a few consecutive frames of not seeing the ball are required for BallLost to become true.

• NearBall: This function is used when we are walking to the ball. It indicates when we are close enough to begin capturing the ball with the chin pinch motion, as determined by a threshold on the head's tilt angle.

• DetermineAndSetKick: This function is used when transitioning out of WalkingToSeenBall upon reaching the ball. It determines whether or not a chin pinch turn is necessary, what angle the robot should turn to with the ball before kicking, and which kick should be executed.
Finally, an overview of the rules that govern how the states transition into one another is
given in Fig. 12.2.
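To make the control flow concrete, the following is a compressed C++ sketch of these transitions. The Boolean inputs are stand-ins for the helpers above, and some of the branching from Fig. 12.2 (for example, Recover From Kick choosing its successor based on the kick performed) is simplified away.

```cpp
enum State { HEAD_SCAN_FOR_BALL, TURNING_FOR_BALL, WALKING_TO_UNSEEN_BALL,
             WALKING_TO_SEEN_BALL, STOPPED_TO_SEE_BALL, CHIN_PINCH_TURN,
             KICKING, RECOVER_FROM_KICK };

// One decision per brain cycle; inputs stand in for the helpers above.
State step(State s, bool ballLost, bool teammateSeesBall,
           bool nearBall, bool turnDone, bool kickDone) {
    switch (s) {
    case HEAD_SCAN_FOR_BALL:
        if (!ballLost) return STOPPED_TO_SEE_BALL;   // verify before acting
        return teammateSeesBall ? WALKING_TO_UNSEEN_BALL : TURNING_FOR_BALL;
    case TURNING_FOR_BALL:
    case WALKING_TO_UNSEEN_BALL:
        return ballLost ? s : STOPPED_TO_SEE_BALL;
    case STOPPED_TO_SEE_BALL:
        // Enough consecutive sightings flips BallLost to false.
        return ballLost ? HEAD_SCAN_FOR_BALL : WALKING_TO_SEEN_BALL;
    case WALKING_TO_SEEN_BALL:
        if (ballLost) return HEAD_SCAN_FOR_BALL;
        return nearBall ? CHIN_PINCH_TURN : s;       // NearBall threshold
    case CHIN_PINCH_TURN:
        return turnDone ? KICKING : s;               // angle set by DetermineAndSetKick
    case KICKING:
        return kickDone ? RECOVER_FROM_KICK : s;
    case RECOVER_FROM_KICK:
        return HEAD_SCAN_FOR_BALL;                   // simplified: always rescan
    }
    return HEAD_SCAN_FOR_BALL;
}
```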
FIGURE 12.2: The finite state machine that governs the behavior of the attacking robot.
12.2 GOALIE
Like the initial goal-scoring behavior described in Section 12.1.1, our initial goalie behavior was developed without the benefit of localization and, as a result, was relatively crude. It is documented fully in our 2003 technical report [78]. This section details our RoboCup-2003 goalie behavior.
Once our goalie had the ability to determine its position on the field (localization), our
primary strategy for the goalie focussed on staying between the ball and the goal. Given the
large size of the goalie with respect to the goal, we adopted a fairly conservative strategy that
kept the goalie in the goal most of the time.
Whenever the goalie saw the ball, it oriented itself such that it was pointed at the ball and
situated between the ball and the goal. If the ball came within a certain distance of the goal,
the goalie advanced towards the ball and attempted to clear it. After attempting to clear the
ball, the goalie retreated back into the goal, walking backwards and looking for the ball. Any
time the goalie saw the ball in a nonthreatening position, it oriented itself towards the ball and
continued its current course of action.
Whenever the ball was in view, the goalie kept a history of ball positions and time
estimates. This history allowed the goalie to approximate the velocity of the ball, which was
useful in deciding when the goalie should “stretch out” to block a shot on the goal.
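As an illustration, the velocity estimate can be as simple as a finite difference over the stored history; this sketch assumes that form and illustrative names, since the text does not specify the exact estimator.

```cpp
#include <deque>

// Sketch: estimating the ball's velocity from the goalie's history of
// timestamped positions via a finite difference over the window.
struct BallSample { double x, y, t; };  // position in mm, time in seconds

bool ballVelocity(const std::deque<BallSample>& hist,
                  double& vx, double& vy) {
    if (hist.size() < 2) return false;   // need at least two samples
    const BallSample& oldest = hist.front();
    const BallSample& newest = hist.back();
    double dt = newest.t - oldest.t;
    if (dt <= 0.0) return false;
    vx = (newest.x - oldest.x) / dt;     // mm/s
    vy = (newest.y - oldest.y) / dt;
    return true;
}
```

A velocity estimate pointing toward the goal above some threshold could then trigger the "stretch out" block.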
One interesting dilemma we encountered concerned the tradeoff between looking at the ball and looking around for landmarks. It seemed very possible that, given the goalie's size, if it could just stay between the ball and the goal it could do a fairly good job of preventing goals. However, this strategy depended on the goalie being able to keep track of both its own position and the ball's position. When we programmed the goalie to fixate on the ball, it was not able to see enough landmarks to maintain an accurate estimate of its own position. On the other hand, when the goalie focussed on the beacons in order to stay localized, it would often miss seeing the approaching ball. It proved to be very difficult to strike a balance between these two opposing forces.
CHAPTER 13
Coordination
This chapter describes our initial and eventual solutions to coordinating multiple soccer-playing
robots.
13.1 DIBS
Our first efforts to make the players cooperate resulted directly from our attempts to play
games with eight players. Every game would wind up with six robots crowded around the ball,
wrestling for control. At this point, we only had two weeks before our first competition, and
thus needed a solution that did not depend on localization, which was not yet functional. Our
solution was a process we called Dibs.
13.1.2 Thrashing
Unfortunately, this first attempt did not work so well. First of all, the robots’ perception of their
distance to the ball was very heavily dependent on how much of the ball they could see, how the
lights were reflecting off the surface of the ball, and how much of the ball was actually classified
as “orange.” As a result, estimates of the ball's distance varied wildly from brain cycle to brain cycle, often by orders of magnitude in either direction. Secondly, even when estimates
were fairly stable, a robot could think that it was the closest to the ball, start to step, and in the
process move slightly backward, which would signal another robot to go for the ball. The other
13.1.3 Stabilization
To correct these problems, we decided that re-evaluating which robot should go to the ball in
each brain cycle was too much. Evaluating that frequently didn’t give a robot the chance to
actually step forward (this was before our walk was fully developed as well), so that its estimate
of ball distance could decrease. However, we couldn’t just take measurements every n brain
cycles and throw away all the other information—we were strapped for information as it was,
and we didn’t want one noisy measurement to negatively affect the next n brain cycles of play.
Our solution was to average the measurements over a period of time and to transmit only that average every n brain cycles.
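A minimal sketch of this scheme follows; the transmit period and all names are illustrative.

```cpp
// Accumulate a distance estimate every brain cycle, but transmit only
// the average every N cycles.
class DibsSender {
    double sum   = 0.0;
    int    count = 0;
    static const int N = 10;  // transmit period in brain cycles (assumed)
public:
    void onBrainCycle(double ballDistance) {
        sum += ballDistance;
        ++count;
        if (count == N) {
            transmitToTeammates(sum / count);  // send the smoothed estimate
            sum   = 0.0;
            count = 0;
        }
    }
    void transmitToTeammates(double avg);  // platform-specific; omitted here
};
```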
13.1.5 Aging
To prevent deadlock, we introduced an aging system into Dibs. Originally, if a robot had
transmitted a very low estimate of distance to the ball, and then crashed or was removed from
play, the other robots would just remain watching the ball, because they would still have the other
robot’s estimate in their memory. Thus, at the end of each transmit cycle, we incremented the
age of each other robot’s estimate. When the age reached a predetermined cutoff (ten in our
case), the estimate was discarded and set to the maximum value. In this way, other robots could
then resume attacking the ball.
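In code, the aging rule might look like the following sketch; the cutoff of ten comes from the text, while the names and the "maximum value" constant are illustrative.

```cpp
// Each transmit cycle, age every stored teammate estimate; past the
// cutoff, discard it so the teammates may attack the ball again.
const int    AGE_CUTOFF   = 10;    // from the text
const double MAX_DISTANCE = 1e9;   // stand-in for "maximum value"

struct TeammateDibs { double distance; int age; };

void ageEstimates(TeammateDibs* est, int n) {
    for (int i = 0; i < n; ++i) {
        if (++est[i].age >= AGE_CUTOFF) {
            est[i].distance = MAX_DISTANCE;  // treat the robot as gone
        }
    }
}
```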
the goal, while another robot of ours would come up and take the ball right out from under
the nose of the first. Next, the second robot would start to strafe, and a large tangle of robots
would result. To prevent this, we added functions called “callBall” and “relinquishBall.” These
functions merely set flags that made the robot start lying about its distance to the ball and stop
lying, respectively. When lying about its distance to the ball, the robot would always report
zero as its distance estimate. This way, whenever the robot entered the strafing state, it could
effectively let the other robots know that even though it wasn’t seeing the ball, they shouldn’t
go after it. The robot would then relinquish the ball at the beginning of most states, including
when it had lost the ball and when it had just finished kicking the ball.
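A sketch of the flag-setting functions; the function names follow the text, and everything else is illustrative.

```cpp
// While a robot has "called" the ball, it reports distance zero so that
// teammates stay away even when it cannot currently see the ball.
struct DibsState {
    bool callingBall = false;

    void callBall()       { callingBall = true;  }
    void relinquishBall() { callingBall = false; }

    // Value actually transmitted to teammates each transmit cycle.
    double reportedDistance(double trueAverageDistance) const {
        return callingBall ? 0.0 : trueAverageDistance;
    }
};
```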
13.2.1 Roles
Our strategy uses a dynamic system of roles to coordinate the robots. In this system, each robot
has one of three roles: attacker, supporter, and defender. The goalie does not participate in the
role system. This section gives an overview of the ideas behind the roles; the sections that follow describe the mechanics in more detail.
• An attacker robot focuses exclusively on goal-scoring. That is, it tries to find the ball, move to it, and kick it towards the goal.

• The supporter's actions are based on a couple of goals. One is to stay out of the way of the attacker. This is based on the idea that one robot can score by itself more effectively than two robots both trying to score at the same time. Another goal is to be well placed so that if the attacker shoots the ball and it ricochets off the goalie or a wall, the supporter can then become the attacker and continue the attack.

• Our defender stays on the defensive half of the field at all times. Its job is to wait for the ball to be on its half and then go to the ball and clear it back to the offensive side of the field.
The supporter's target position (S_x, S_y) is given by
$$S_x = \begin{cases} 1150, & \text{if } x > 1450 \\ 1750, & \text{if } x \leq 1450 \end{cases} \tag{13.1}$$
and
$$S_y = \min\left(\frac{y + 4200}{2},\ 3800\right). \tag{13.2}$$
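In code, Eqs. (13.1) and (13.2) amount to the following; the function name is illustrative, and the assumption that (x, y) is the ball's position on the field in millimeters is ours.

```cpp
#include <algorithm>

// Direct transcription of Eqs. (13.1) and (13.2).
void supporterTarget(double x, double y, double& Sx, double& Sy) {
    Sx = (x > 1450.0) ? 1150.0 : 1750.0;        // Eq. (13.1)
    Sy = std::min((y + 4200.0) / 2.0, 3800.0);  // Eq. (13.2)
}
```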
General Rules
We label the three robots R1, R2, and R3. Then the following rules influence R1's choice of role. (The rules are the same for each robot; the labels serve only to distinguish which robot's role is currently being determined.)
• The default is for each robot to keep its current role. It will only change roles if a specific rule applies.

• If R1 finds that it is “alone,” in that it has not been receiving communication from its teammates for some time, it automatically assumes the role of an attacker.

• In most cases, communication works fine, and if neither R2 nor R3 is a defender, then R1 will automatically become (or stay) a defender. This ensures that (under normal conditions) there will always be at least one defender. Ensuring that there is not more than one defender is taken care of in the section on attacker and defender switching.

• If R1 is a supporter and so is R2 or R3, then R1 will automatically become an attacker. This could happen accidentally if two attackers simultaneously decide to become supporters without enough time in between for the second one to be aware of the first's decision. In this case, this rule ensures that at least one of the supporters will immediately go back to being an attacker.
hysteresis, which is a common and generally useful concept in robotics and control problems
when it is important to prevent quick oscillations between different behaviors.
An important measure that we use to evaluate a robot’s utility as an attacker is its kick
time. This is an estimate of the amount of time it will take the robot in question to walk up
to the ball, turn it towards the goal, and kick. Each robot calculates its own kick time and
communicates it to the other robots as part of their communication of strategic information.
The estimated amount of time to get to the ball is the estimated distance to the ball divided by
the forward speed. The time to turn with the ball is determined by calculating the angle that
the ball will have to be turned and dividing by the speed of the chin-pinch turn.
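A sketch of this estimate; the speed constants are placeholders rather than our tuned values.

```cpp
#include <cmath>

// Kick time: time to walk to the ball plus time to turn with it
// toward the goal.
double kickTime(double ballDistance,  // mm
                double turnAngle) {   // radians the ball must be turned
    const double FORWARD_SPEED = 250.0;  // mm/s, assumed
    const double TURN_SPEED    = 1.0;    // rad/s of chin-pinch turn, assumed
    return ballDistance / FORWARD_SPEED + std::fabs(turnAngle) / TURN_SPEED;
}
```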
Consider the case where there are two attackers, A1 and A2. Once A1's period of stickiness has expired, it will become a supporter precisely when all of the following conditions are met:

• A1 and A2 both see the ball. This helps to ensure the accuracy of the other information being used.

• The ball is in the offensive half, as are both robots A1 and A2. Becoming a supporter is only useful when our team is on the attack.

• A1 has a higher kick time than A2. That is, A2 is better suited to attack, so A1 should become the supporter.
Once we have a supporter, S, and the role is no longer stuck, it will turn back into an attacker if any of the following conditions hold:

• S, the ball, or the attacker (A) goes back into the defensive half.

• A and S both see the ball, and S's estimate of its distance from the ball is smaller than A's.

• A doesn't see the ball, and S's estimate of the ball's distance from it is less than some constant (300 mm).

• S has been a supporter for longer than some constant amount of time (12 s).
As mentioned above, our role system was developed quite hastily in the last week or so before the competition. Nevertheless, the system performed quite appropriately during games. The attacker/defender switches normally occurred where they seemed intuitively reasonable. The two attackers (with one periodically becoming a supporter), trying to score a goal, frequently looked like a well-organized pair of teammates. Nonetheless, there were certainly instances during games where a role change happened at an inopportune time, or where it seemed the robots should have “known better” than to do what they just did. Finding viable solutions to problems like these can be strikingly difficult.
CHAPTER 14
Simulator
Debugging code and fine-tuning parameters are often cumbersome to perform on a physical
robot. For example, the particle filtering approach to localization that we used requires many parameters to be tuned. In addition, the robustness of the algorithm often masks errors, making them difficult to track down. For these reasons, we constructed a simulator
that allows the robot’s high-level modules to interact with a simulated environment through
abstracted low-level sensors and actuators. The following sections describe this simulator in
detail.
The following message types are sent from the server to the client:
• connect - Acknowledge client initialization.
• see - Send observations for this time step.
• sense - Send joint feedback for this time step.
• error - Respond to an incorrectly formatted client message.
robot’s pose estimate from localization is displayed as a light (blue) triangle, and its particles
are shown as white dots.
The user can reposition the robot by clicking on the robot’s body with the left mouse
button, dragging, then releasing. The (orange) ball can be moved around in this way as well.
The robot’s orientation can be changed by clicking on the robot’s body with the right mouse
button, dragging up and down, then releasing. Additional controls allow the user to pause the
simulator and temporarily “blind” the robots by turning off sensory messages.
FIGURE 14.1: Screen shot of simulator. The robot’s true position is in dark (blue). Its estimated
position is in light (blue). The localization particles are shown in white. The ball estimate particles are
shown in red (grey).
CHAPTER 15
UT Assist
In addition to the simulator, which we used primarily for developing and testing localization,
we developed a valuable tool to help us debug our robot behaviors and other modules. This
tool, which we called UT Assist, allowed us to experience the world from the perspective of our
AIBOs and monitor their internal states in real time.
• Low- and high-resolution images – UT Assist can transfer full-resolution images from the AIBO's camera (76,032 bytes in size) as well as smaller versions of the same image (4752 bytes), which lack the clarity of the high-resolution images but can be sent at a much faster rate. The small images can typically be transmitted and displayed in 1/3 of a second, whereas the large images take 3–4 s to be displayed.
• Color-segmented images – The AIBO can send color-segmented images to UT Assist. The color-segmented images are the high-resolution images in which each pixel has been classified by the AIBO's low-level vision as one of the several discrete colors (for more details, see Section 4.2).

FIGURE 15.1: Several examples of visual data displayed in UT Assist. Part (a) contains a low-resolution image with vision objects overlaid on top. Part (b) shows a color-segmented image. Part (c) contains a low-resolution image, and (d) contains a high-resolution image.
• Vision objects – UT Assist can also display bounding boxes around objects that the AIBO's high-level vision software has recognized. UT Assist can overlay these bounding boxes on top of regular camera images, a technique that is useful for identifying possible bugs in the vision system.
FIGURE 15.2: An overhead view of the field, as displayed on UT Assist. The AIBO’s position and
orientation are denoted by the small (blue) triangle. The white dots on the field represent the particle
distribution that the AIBO uses to determine its position and orientation. Also shown are an estimate
of error in the position (the oval), data from the IR sensor (the short line), and an estimate of the ball’s
position (the circle).
1. The user requests an image from an AIBO. The server sends this request to the client
on the specified AIBO.
2. The vision module in the Brain of that AIBO responds by sending a high-resolution image back over the network to the server.
3. The image is displayed on the user’s screen, and the user is allowed to “paint” various
colors on the image (i.e., label the pixels on the image). The colors and the underlying
pixels of the image are paired and saved so that they can later be used to compute the
Intermediate (IM) cubes for the AIBO.
4. The user repeats this process, from step one, until she/he is satisfied with the resulting
color calibration. Then the Master (M) cube is generated, loaded on the memory stick,
and used by the AIBO for subsequent image classification.
While painting the images, the user can view what the image would look like had it been
processed with the current color cube (NNr cube) using the Nearest Neighbor (NNr) rule. The
user can also preview false-3D graphs of the YCbCr color space for each color. These represent
the IM color cubes. The user can also see the M cube, generated by applying a NNr scheme on
the NNr cube obtained by merging the IM cubes. For more details on color segmentation, see
Section 4.2.
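The underlying nearest-neighbor rule can be sketched as the following brute-force classifier; the real system precomputes the result of this rule into the lookup cubes, and all names here are illustrative.

```cpp
#include <limits>
#include <vector>

// NNr rule sketch: an unlabeled pixel takes the color label of its
// nearest painted training pixel in YCbCr space.
struct LabeledPixel { int y, cb, cr, color; };

int classifyNNr(int y, int cb, int cr,
                const std::vector<LabeledPixel>& train) {
    int best = -1;
    long bestDist = std::numeric_limits<long>::max();
    for (const LabeledPixel& p : train) {
        long dy = y - p.y, db = cb - p.cb, dr = cr - p.cr;
        long d = dy * dy + db * db + dr * dr;  // squared Euclidean distance
        if (d < bestDist) { bestDist = d; best = p.color; }
    }
    return best;  // one of the discrete color labels
}
```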
FIGURE 15.3: Color calibration for the AIBO using UT Assist. Part (a) shows the initial image as
viewed in the UT Assist Image Segmenter. In (b), the user has started to classify (label) image pixels by
painting various colors on them. Part (c) depicts the image after it has been classified, and (d) shows the
distribution of colors in the three-dimensional YCbCr space, i.e., the final Master Cube.
CHAPTER 16
Conclusion
The experiences and algorithms reported in this lecture comprise the birth of the UT Austin
Villa legged-league robot team, as well as some of the subsequent developments. Thanks to a
focussed effort over the course of 5 1/2 months, we were able to create an entirely new code base
for the AIBOs from scratch and develop it to the point of competing in RoboCup 2003. The
efforts were the basis of a great educational experience for a class of mostly graduate students,
and then jump-started the Ph.D. research of many of them.
The results of our efforts in the RoboCup competitions between 2003 and 2005 are
detailed in Appendix F. Though the competitions are motivating, exciting, and fun, the main
goal of the UT Austin Villa robot legged soccer team is to produce cutting edge research
results. Between 2003 and 2005, we have used the AIBO robots and our RoboCup team code
infrastructure as the basis for novel research along several interesting directions. Here, the titles
of some of these research papers are listed. Full details are available from our team web-page at
http://www.cs.utexas.edu/~AustinVilla/?p=research.
The papers span vision, localization, joint control, behavior learning, and robot surveillance; examples include:

• Learning Ball Acquisition on a Physical Robot [23] (behavior learning).

• Continuous Area Sweeping: A Task Definition and Initial Approach [1] (robot surveillance).
In summary, this lecture has been designed to provide a case study illustrating the
challenges and benefits of working with a new robot platform for the purposes of education
and research. Because the robot used was the Sony AIBO robot and the target task used was
robot soccer, some of the details provided are specific to that robot and that task. But the
intention is to convey general lessons that will continue to apply as new, more sophisticated,
but still affordable robots continue to be rolled out. In addition, because the AIBO is a vision-
based legged robot, many of the methods described are framed to apply to other robots that
are vision-based, legged, or both. As robot technology continues to advance, more and more
affordable robots that are suitable for class and research purposes are likely to share one or both
of these characteristics.
Appendix A
1. ImgCenter_y: This is the center of the image plane along the y-axis (x varies from 0 to 175 while y varies from 0 to 143), given by 143/2 ≈ 71.
$$(x - x_1)^2 + (y - y_1)^2 = r^2$$
$$(x - x_2)^2 + (y - y_2)^2 = r^2$$
$$(x - x_3)^2 + (y - y_3)^2 = r^2.$$
FIGURE A.1: Given three points P1, P2, P3, we need to find the equation of the circle passing through them.
Through the pairs of points P1, P2 and P2, P3, we can form two lines L1, L2. The equations of the two lines are
$$y_{L1} = m_{L1} \cdot (x - x_1) + y_1 \tag{A.3}$$
$$y_{L2} = m_{L2} \cdot (x - x_2) + y_2, \tag{A.4}$$
where
$$m_{L1} = \frac{y_2 - y_1}{x_2 - x_1}, \qquad m_{L2} = \frac{y_3 - y_2}{x_3 - x_2}. \tag{A.5}$$
The center of the circle is the point of intersection of the perpendiculars to lines L1, L2 passing through the midpoints of segments P1P2 (a) and P2P3 (b). The equations of these perpendiculars are
$$y_{c_a} = \frac{-1}{m_{L1}} \left(x - \frac{x_1 + x_2}{2}\right) + \frac{y_1 + y_2}{2} \tag{A.6}$$
$$y_{c_b} = \frac{-1}{m_{L2}} \left(x - \frac{x_2 + x_3}{2}\right) + \frac{y_2 + y_3}{2}. \tag{A.7}$$
Solving these simultaneously gives the center of the circle:
$$x = \frac{m_{L1} m_{L2} (y_1 - y_3) + (m_{L2} - m_{L1}) x_2 + (m_{L2} x_1 - m_{L1} x_3)}{2 (m_{L2} - m_{L1})} \tag{A.8}$$
$$y = \frac{(x_3 - x_1) + (m_{L2} - m_{L1}) y_2 + (m_{L2} y_3 - m_{L1} y_1)}{2 (m_{L2} - m_{L1})}. \tag{A.9}$$
This gives us all the parameters we need to get the size of the ball.
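A direct transcription of Eqs. (A.5), (A.8), and (A.9) into code; the function name is illustrative, and collinear or vertically aligned points are not handled.

```cpp
#include <cmath>

// Center (cx, cy) and radius r of the circle through P1, P2, P3.
bool circleFrom3Points(double x1, double y1, double x2, double y2,
                       double x3, double y3,
                       double& cx, double& cy, double& r) {
    double mL1 = (y2 - y1) / (x2 - x1);        // Eq. (A.5)
    double mL2 = (y3 - y2) / (x3 - x2);
    double denom = 2.0 * (mL2 - mL1);
    if (denom == 0.0) return false;            // points are collinear
    cx = (mL1 * mL2 * (y1 - y3) + (mL2 - mL1) * x2
          + (mL2 * x1 - mL1 * x3)) / denom;    // Eq. (A.8)
    cy = ((x3 - x1) + (mL2 - mL1) * y2
          + (mL2 * y3 - mL1 * y1)) / denom;    // Eq. (A.9)
    r = std::hypot(cx - x1, cy - y1);          // distance to any of the points
    return true;
}
```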
1. The two bounding boxes that form the two sections of the beacon are allowed to form
a legal beacon if the number of pixels and number of runlengths in each section are at
least one-half of that in the other section.
2. Neither of the two regions may be ‘too big’ in comparison with the other. ‘Too big’ refers to cases where the x- and/or y-coordinate ranges of one section are greater than 3–3.5 times the corresponding ranges of the other bounding box. If ‘ul’ and ‘lr’ denote the upper-left and lower-right corners of the bounding box, then, for the x-coordinates, we treat either of the following conditions as sufficient for rejecting the formation of a beacon from these two blobs:
(b1.lrx − b1.ulx + 1) ≥ 3 · (b2.lrx − b2.ulx + 1)
(b2.lrx − b2.ulx + 1) ≥ 3 · (b1.lrx − b1.ulx + 1).
Similarly, for the y-coordinates
(b1.lry − b1.uly + 1) ≥ 3 · (b2.lry − b2.uly + 1)
(b2.lry − b2.uly + 1) ≥ 3 · (b1.lry − b1.uly + 1).
The runregion size checks (see Appendix A.1) help ensure the removal of beacons that
are too small.
3. The x-coordinate of the centroid of each section must lie within the x range values of
that of the other section, i.e.,
b1.ulx ≤ b2centroidx ≤ b1.lrx
b2.ulx ≤ b1centroidx ≤ b2.lrx.
FIGURE A.2: This figure shows the basic beacon size extension to compensate for partial occlusions. Case (a) is an example of vertical extension while case (b) depicts an example of horizontal extension.
4. The distance between sections (in number of pixels) is also used to decide whether
two bounding boxes can be combined to form a beacon. In our case, this threshold is
3 pixels.
5. Size extension: In the case of the beacons, similar to the ball (though not to that large a
degree), occlusion by other robots can cause part of the beacon to be ‘chopped off’. On
the basis of our region merging and beacon-region-matching size constraints, we can
still detect beacons with a fair amount of occlusion. But once the beacons have been
determined, we extend the size of the beacon such that its dimensions correspond to
that of the larger beacon subsection determined (see Fig. A.2).
6. Beacon likelihoods: After the beacon dimensions have been suitably extended, we use the estimated size and the known aspect ratio in the actual environment to arrive at a likelihood measure for the estimate. The beacon has a Height : Width aspect ratio of 2 : 1, and in the ideal case we would expect a similar ratio in the image. So we compare the aspect ratio in the image with the desired aspect ratio, and this comparison provides an initial likelihood for the estimate. Whenever there are multiple occurrences of the same beacon, this value is used as a criterion and the ‘most-likely’ beacon values are retained for further calculations. Further, since the occurrence of false positives is a greater problem (for localization) than missing a beacon, we only use beacons with a likelihood ≥ 0.6 for localization computations.
1. We want to detect the goals accurately both when they are at a distance and when they are very close to the robot; in each case, the image of the goal captured by the robot, and hence the bounding boxes formed, are significantly different. So we use almost the same criteria as with other objects (runlengths, pixels, density, aspect ratio, etc.) but specify ranges. The ranges are chosen appropriately: we first use strict constraints and search for ideal goals that are clearly visible and at close range (these get assigned a high likelihood—see the next point), but if we fail to find any, we relax the constraints and try again. In addition, we have slightly different parameters tuned for the yellow and blue goals because the identification of the two goals differs based on the lighting conditions, robot camera settings, etc.

• Runlengths: at a minimum, we require 10–14.
1. For the basic detection of a blob as a candidate opponent blob, we use constraints on runlengths, number of pixels, etc., which determine the tradeoff between accuracy and the maximum distance at which opponents can be recognized.

• Pixel threshold: We set a threshold of 150–300 pixels.

• Runlengths: We require around 10.
2. Tilt test: Opponents cannot float much above the ground, nor appear much below the horizon (on the ground), when the robot's head is not tilted much. The threshold values here are 1° (high) and −10° (low).
By a similar matrix transform, we can move from the global coordinate frame to the local
coordinate frame.
$$\begin{bmatrix} x_l \\ y_l \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta & -p_x \cos\theta - p_y \sin\theta \\ -\sin\theta & \cos\theta & p_x \sin\theta - p_y \cos\theta \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X_g \\ Y_g \\ 1 \end{bmatrix} \tag{A.12}$$
These are the equations that we refer to whenever we speak about transforming from
the local to global coordinates or vice versa. For more details on coordinate transforms in 2D
and/or 3D, see [67].
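A small sketch of this transform, algebraically equivalent to Eq. (A.12) with the translation and rotation applied explicitly; names are illustrative.

```cpp
#include <cmath>

// Map a point (Xg, Yg) in the global frame into the robot's local frame,
// given the robot pose (px, py, theta): translate by -p, rotate by -theta.
void globalToLocal(double Xg, double Yg,
                   double px, double py, double theta,
                   double& xl, double& yl) {
    double c = std::cos(theta), s = std::sin(theta);
    double dx = Xg - px, dy = Yg - py;
    xl =  c * dx + s * dy;
    yl = -s * dx + c * dy;
}
```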
FIGURE A.3: This figure shows the basic global and local coordinate systems.
• Forward-step distance: How far forward the foot should move from its home position in one step.

• Side-step distance: How far sideways the foot should move from its home position in one step.

• Turn-step distance: How far each half step should be for turning.

• Front-shoulder height: How high from the ground the robot's front legs' J1 and J2 joints should be.

• Back-shoulder height: How high from the ground the robot's back legs' J1 and J2 joints should be.

• Ground fraction: What fraction of a step time the robot's foot is on the ground. (The rest of the time is spent with the foot in the air, tracing a half-ellipse.) A unitless value between 0 and 1.
Appendix B: Kicks
Kicking has been an area of continual refinement for our team. The general kick architecture
that was a part of our initial development is presented in Chapter 7. Six of the kicks developed
in the first year are summarized here.
TABLE B.4: Fall-Forward Kick Critical Action and {(Pose2, Poseg, 32)}
We first isolated the kick from the walk. Table B.3 shows the critical action.
We then integrated the walk with the kick. There was no need to create an initial action because any momentum resulting from (Posey, Pose1, 32) was in the forward direction (the same direction in which we wanted the robot to fall). However, testing revealed that (Pose2, Posez, t) caused the robot to fall forward onto its face every time. Although the robot successfully kicked the ball, it could not immediately resume walking. In this situation, the robot had to wait for its fall detection to trigger and tell it to get up before resuming the walk. The get-up routine triggered by fall detection was very slow. Thus, we found a Poseg such that {(Pose2, Poseg, 32), (Poseg, Posez, t)} does not hinder the robot's ability to resume walking. Table B.4 shows the critical action with Move(Pose2, Poseg, 32).
We then observed that transitioning from Pose2 directly to Poseg was not ideal: the robot would fall over 25% of the time during (Pose2, Poseg, 32). Thus, we added Posew to precede Poseg in the final action. Afterwards, the robot no longer fell over when transitioning from the kick to the walk. Table B.5 shows the entire finely controlled action, consisting of the critical action and the final action.
The fall-forward kick executed quickly and potentially moved the ball the entire distance
of the field (4.2 m). Unfortunately, the fall-forward kick did not reliably propel the ball directly
Appendix C: TCPGateway
As described in Chapter 9, a module called TCPGateway was provided for use at the RoboCup competitions; it abstracted away most of the low-level networking, providing a standard Open-R interface in its place. This brief section describes our interface with that module.
The TCPGateway configuration files insert two network addresses and ports in the
middle of an Open-R subject/observer relationship. This creates the following situation:
• Instead of sending data directly to the intended observer, the subject on the initiating robot sends data to a TCPGateway observer.

• The TCPGateway module on the initiating robot has a specific connection on a unique port for data flowing in that direction, and sends the data from the subject over that connection to the PC.

• The PC, which has been configured to map data from one incoming port to one outgoing port, sends the subject's data out to the receiving robot on a specific port.

• The TCPGateway module on the receiving robot processes the data that it receives on this port and sends the data to the intended observer.
All of the mappings described above were defined in two files on each robot
(CONNECT.CFG and ROBOTGW.CFG) and in two files on the PC (CONNECT.CFG and
HOSTGW.CFG).
• Time information: the current time, the amount of time spent processing this Brain cycle, and the amount of time spent processing the last Brain cycle.

• The current score of the game.

• Information about the state of each teammate: their positions, roles, and ideas about the location of the ball.

• Which teammates are still “alive” (i.e., not crashed).

• The position of the AIBO on the field.

• The position of the ball and a history of where it has been.

• What objects the AIBO currently sees.

• Matrices that describe the current body position of the AIBO.

• A history of which electrostatic sensors have been activated, and for how long.

• All of the communication data, including messages from our teammates and from the game controller.
Our WorldState object also provided 132 functions that access these data. Some of these
functions return the data in its original form, but most of them process the data to provide
a higher-level view of that data. For example, there are a number of ways we can access the
position of the AIBO on the field.
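For instance, accessors in the following style could be exposed; these names and signatures are hypothetical illustrations, not our actual interface.

```cpp
// Hypothetical accessors illustrating the style: the same underlying pose
// served raw and at several higher levels of processing.
struct Pose { double x, y, theta; };  // field position (mm) and heading

class WorldState {
public:
    Pose   getRawPose() const;                    // latest localization output
    Pose   getFilteredPose(double& unc) const;    // smoothed, with uncertainty
    bool   inOwnHalf() const;                     // higher-level predicate
    double distanceTo(double x, double y) const;  // derived quantity
};
```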
One notable addition to the WorldState object for 2004 was a distributed representation
of the ball. Instead of only having a single position estimate for the ball, this distributed repre-
sentation used a variable number (in our case, 20) of particles, where each particle represented
one possible location for the ball. A probability was associated with each particle that indicated
how confident we were in that estimate of the ball’s position. While this approach was originally
inspired by particle filtering (an algorithm used by our localization algorithm—see Chapter 8),
it was implemented as a much simpler algorithm. In order to get the behavior that we wanted
out of these “ball particles”, we dramatically reduced the effect of the probability update step,
instead choosing to update the particles almost entirely by reseeding. Particles were reseeded
(i.e. one particle was added to the filter, replacing a low-probability particle) whenever the
AIBO saw the ball for two consecutive frames or whenever one of its teammates was able to
see the ball and communicate that information to the AIBO.
One way that we used the information from these ball particles was by reducing them to a single angular dimension centered on the AIBO, so that we could support behaviors that require a single angle to the ball (for example, in order to look at the ball we need to know what angle to turn the head to). The method that we used
to extract a single ball position from the ball particles was to cluster the particles according to
their (x, y) coordinates on the field, in a manner similar to that of our localization algorithm
(for more information about this clustering algorithm, see the localization section in our 2003
technical report [78]). Having a distributed representation of the ball allowed the AIBOs to
deal gracefully with conflicting reports of the ball’s position caused by vision and localization
errors.
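A minimal sketch of the reseeding step; the pool size of 20 comes from the text, while the names and replacement-policy details are illustrative.

```cpp
#include <algorithm>
#include <vector>

// A fixed pool of ball particles in which a fresh observation replaces
// the current lowest-probability particle.
struct BallParticle { double x, y, prob; };

class BallModel {
    std::vector<BallParticle> particles;
public:
    BallModel() : particles(20) {}  // 20 particles, as in the text

    // Called when we see the ball for two consecutive frames, or when a
    // teammate communicates a ball sighting.
    void reseed(double x, double y, double prob) {
        auto worst = std::min_element(
            particles.begin(), particles.end(),
            [](const BallParticle& a, const BallParticle& b) {
                return a.prob < b.prob;
            });
        *worst = BallParticle{x, y, prob};
    }
};
```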
Some of the information in the WorldState is shared between the AIBOs, using the
communication framework described in Chapter 9. One of the first steps in each Brain cycle
involves the AIBO checking for information from other teammates and inserting that informa-
tion into its own WorldState. At the end of each Brain cycle, each AIBO sends some of its own
information to its teammates. The following information is transmitted between AIBOs:
• Ping/pong messages—used to determine which AIBOs are alive and connected to the network.
• (param walk ∂x ∂y ∂θ)
Set the robot's translational velocity to ∂x, ∂y in mm/s and its rotational velocity to ∂θ in degrees per second.

• (move head pan tilt)
Move the robot's head to angles pan and tilt in degrees.

The following sensation strings are sent as part of sense messages. Currently, only one type of sensation is supported.

• (h pan tilt)
The robot's head joint feedback returns angles pan and tilt in degrees.
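A sketch of composing these action strings on the client side; the exact numeric formatting is an assumption beyond what the text shows.

```cpp
#include <cstddef>
#include <cstdio>

// e.g. makeWalkMsg(buf, sizeof buf, 200, 0, 15)
// yields "(param walk 200.00 0.00 15.00)"
void makeWalkMsg(char* buf, std::size_t n,
                 double dx, double dy, double dtheta) {
    std::snprintf(buf, n, "(param walk %.2f %.2f %.2f)", dx, dy, dtheta);
}

void makeHeadMsg(char* buf, std::size_t n, double pan, double tilt) {
    std::snprintf(buf, n, "(move head %.2f %.2f)", pan, tilt);
}
```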
Appendix F: Competition Results
1. Don't see the ball off the field. There was a parquet floor and wooden doors in the room. The robots often appeared to see them as orange and tried to walk off the field.
2. Get localization into the goalie. At the time, we were using the initial goalie solution,
and the goalie often found itself stuck at the corner.
3. Start with a set play. When we had the kickoff, the other team often got to the ball
before we were ready to kick it. We decided to try reflexively kicking the ball and having
another robot walk immediately to the target of the kick.
TABLE F.1: The Scores of Our Three Games at the American Open
4. Get a sideways head kick in. Whereas our initial goal-scoring behavior strafed around
the ball to look for the goal, the other team’s robots often walked to the ball and
immediately flicked it with their heads. That provided them with a big advantage.
5. Move faster. Most other teams in the tournament used CMU’s walk from their 2002
code. Those teams moved almost twice as fast as ours.
6. Kick more reliably and quickly. It was often clear that our robots were making good
decisions. However, their attempted kicks tended to fail or take too long to set up.
Some of these priorities—particularly the last two—were clearly too long-term to implement
in time for the competition. But they were important lessons.
Overnight, we added some preliminary localization capabilities to the goalie so that it
could get back into the goal more quickly. We also developed an initial set play and made
some progress toward kicking more quickly when the ball was near the goal (though not on the
rest of the field) by just walking forward when seeing the orange ball directly in front of the
goal.
Our set play for use on offensive kickoffs involved two robots. The first one, placed
directly in front of the ball, simply moved straight to the ball and kicked it directly forward—
without taking any time to localize first. We tended to place the robot such that it would kick
the ball towards an offensive corner of the field. The second robot, which was placed near the
corresponding center beacon, reflexively walked forwards until either seeing the ball or timing
out after moving for about half a meter.
On defensive kickoffs, our set play was simply to instruct one of the robots to walk
forwards as quickly as possible. We placed that robot so that it was facing the ball initially.
In the first game, we tied Tec de Monterrey 1–1 in a largely uneventful game. In the
penalty shootout, we lost 2–1. Though we only attained limited success, given the time we had
to develop our team, we were quite happy to score a goal and earn a point in the competition
(1 point for a shootout loss).
Like at the American Open, we made sure to arrange some practice games in Italy.
The results are shown in Table F.3. Our first test match was against CMU, the defending
champions. We ended up with an encouraging 2–2 tie, but it was only their first test match as
well, with some things still clearly not working correctly yet. Overall, we were fairly satisfied
with our performance.
In our first “official” practice match (organized by the league chairs), we played against
the University of Washington team (3rd place at the American Open) and lost 1–0. It was a
fairly even game. They scored with 10 s left in the half (the game was just one half). They also had one other clear chance, on which they kicked the wrong way. In this game, it became apparent that we had introduced some bugs while tuning the code the day before. For example,
the fall detection was no longer working. We also noticed that our goalie often turned around
Given how much effort we needed to put in just to create an operational team in time
for the competition, we did not focus very much attention on the challenges until we arrived in
Italy. Nonetheless, we were able to do quite well, which we take as a testament to the strengths
of our overall team design.
On the first challenge, we finished in the middle of the pack. Our robot did not succeed
at getting all the way to the black and white ball (only eight teams succeeded at that), but of
all the teams that did not get to the ball, our robot was one of the closest to it, which was the
tie-breaking scoring criterion. Our rank in this event was 12th.
In this challenge event, we used our normal vision system with a change in high-level vision for ball detection (see Section 4.4 for details on object recognition). The black and white ball appears almost fully white from a distance, i.e., in cases where we can see the entire ball, and the algorithm first searched for such blobs whose bounding boxes had the required aspect ratio (1:1). In other cases, in which the ball is partially occluded, the ball was seen as being made up of black and white blobs, and the idea was to group similar-sized blobs that were significantly close to each other (with a threshold determined by experimentation). This required
us to also train on the black and white ball when building up the color cube (see Section 4.2
for details on color segmentation). This approach worked well in our lab, but on the day of the
challenge we did not have the properly trained color cube on the robot, which resulted in the
robot not being able to see the ball well enough to go to it.
1 http://www.cs.uno.edu/~usopen04/.
TABLE F.4: The Scores of Our Five Games at the U.S. Open
our three games are shown in Table F.4. Links to videos from these games are available at
http://www.cs.utexas.edu/~AustinVilla/?p=competitions/US_open_2004.
Our first game against ITAM earned us our first official win in any RoboCup competition.
The score was 3–0 after the first half and 8–0 at the end. The attacker’s adjustment mechanism
designed to shoot around the goalie (Section 12.1) was directly responsible for at least one of
the goals (and several in later games).
Both ITAM and Georgia Tech were still using the smaller and slower ERS-210 robots,
which put them at a considerable disadvantage. Dortmund, like us, was using the ERS-7 robots.
Although we led the team from Dortmund 2–1 at halftime, we ended up losing 4–2. But by
beating Georgia Tech, we still finished 2nd in the group and advanced to the semi-finals against
Penn, who won the other group.
Our main weakness in the game against Dortmund was that our goalie often lost track
of the ball when it was nearby. We focused our energy on improving the goalie, eventually
converging on the behavior described in Section 12.2, which worked considerably better.
Nonetheless, the changes weren’t quite enough to beat a good Penn team. Again we were
winning in the first half and it was tied 2–2 at halftime, but Penn managed to score the only
goal in the second half to advance to the finals.
Happily, our new and improved goalie made a difference in the third place game where
we won the rematch against Dortmund by a score of 4–3. Thus, we won the 3rd place trophy
at the competition!
We came away from the competition looking forward to the RoboCup 2004 competition two months later. Our main priorities for improvement were improving the localization, including the ability to actively localize; adding more powerful and varied kicks; and developing more sophisticated coordination schemes.
In this pool, Les 3 Mousquetaires and Chile were both using ERS-210 robots, while
the other teams were all using ERS-7s. In many of the first round games, the communication
among robots and the game controller was not working very well (for all teams), and thus
reduced performance on all sides. In particular, in the game against Penn, the robots had to be
started manually and were unable to reliably switch roles. In that game, we were winning 2–0,
but again saw Penn come back, this time to tie the game 3–3.
This result left us in a tie with Penn for first place in the group going into the final
game. Since the tiebreaker was goal difference, we needed to beat Chile by two goals more
than Penn beat ARAIBO in the respective last games in order to be the group’s top seed. Penn
proceeded to beat ARAIBO 9–0, leaving us in the unfortunate position of needing to score
2 http://www.robocup2004.pt/.
TABLE F.6: The Scores of Our Five Games at RoboCup 2005
The most exciting game was our first official matchup against CMU. It was a very close
game, with UT Austin Villa taking a 1–0 lead before eventually losing 2–1. Meanwhile, the
match against rUNSWift demonstrated clearly that our team was not quite at the top level of
the competition. Indeed two other teams (NuBots and the German Team) were also clearly
stronger than UT Austin Villa.
3 http://www.robocup2005.org.
References
[1] M. Ahmadi and P. Stone, “Continuous area sweeping: A task definition and initial
approach,” in 12th Int. Conf. Advanced Robotics, July 2005.
[2] H. L. Akın, Ç. Meriçli, T. Meriçli, K. Kaplan and B. Çelik, “Cerberus’05 team re-
port,” Technical report, Artificial Intelligence Laboratory, Department of Computer
Engineering, Boğaziçi University, Oct. 2005.
[3] M. Asada and H. Kitano, Eds. RoboCup-98: Robot Soccer World Cup II. Lecture Notes in
Artificial Intelligence, vol. 1604. Berlin: Springer Verlag, 1999.
[4] J. A. Bagnell and J. Schneider, “Autonomous helicopter control using reinforcement
learning policy search methods,” in Int. Conf. Robotics and Automation, IEEE Press,
2001, pp. 1615–1620.
[5] S. Belongie, J. Malik and J. Puzicha, “Shape matching and object recognition using shape
contexts,” Pattern Analysis and Machine Intelligence, April 2002.
[6] A. Birk, S. Coradeschi and S. Tadokoro, Eds., RoboCup-2001: Robot Soccer World Cup
V. Berlin: Springer Verlag 2002.
[7] G. Buchman, D. Cohen, P. Vernaza and D. D. Lee, “UPennalizers 2005 Team Re-
port,” Technical report, School of Engineering and Computer Science, University of
Pennsylvania, 2005.
[8] S. Chen, M. Siu, T. Vogelgesang, T. F. Yik, B. Hengst, S. B. Pham and C. Sammut,
RoboCup-2001: The Fifth RoboCup Competitions and Conferences. Berlin: Springer Verlag,
2002.
[9] ——, “The UNSW RoboCup 2001 Sony legged league team,” Tech-
nical report, University of New South Wales, 2001. Available at
http://www.cse.unsw.edu.au/~robocup/2002site/.
[10] W. Chen, “Odometry calibration and gait optimisation,” Technical report, The Univer-
sity of New South Wales, School of Computer Science and Engineering, 2005.
[11] S. Chernova and M. Veloso, “An evolutionary approach to gait learning for four-legged
robots,” in Proc. of IROS’04, Sept. 2004.
[12] D. Cohen, Y. H. Ooi, P. Vernaza and D. D. Lee. RoboCup-2003: The Seventh RoboCup
Competitions and Conferences. Berlin: Springer Verlag, 2004.
[13] ——, “The University of Pennsylvania RoboCup 2004 legged soccer team, 2004.” Avail-
able at URL http://www.cis.upenn.edu/robocup/UPenn04.pdf.
[31] J.-S. Gutmann, T. Weigel and B. Nebel, “A fast, accurate and robust method for self
localization in polygonal environments using laser range finders,” Advanced Robotics,
vol. 14, No. 8, 2001, pp. 651–668. doi:10.1163/156855301750078720
[32] B. Hengst, D. Ibbotson, S. B. Pham and C. Sammut, “Omnidirectional motion for
quadruped robots,” in A. Birk, S. Coradeschi and S. Tadokoro, Eds, RoboCup-2001:
Robot Soccer World Cup V. Berlin: Springer Verlag, 2002.
[33] G. S. Hornby, M. Fujita, S. Takamura, T. Yamamoto and O. Hanagata, “Autonomous
evolution of gaits with the Sony quadruped robot,” in W. Banzhaf, J. Daida, A. E.
Eiben, M. H. Garzon, V. Honavar, M. Jakiela and R. E. Smith, Eds, Proc. Genetic
and Evolutionary Computation Conf., vol. 2: Morgan Kaufmann: Orlando, Florida, USA,
1999, pp. 1297–1304.
[34] G. S. Hornby, S. Takamura, J. Yokono, O. Hanagata, T. Yamamoto and M. Fujita,
“Evolving robust gaits with AIBO,” in IEEE Int. Conf. Robotics and Automation, 2000,
pp. 3040–3045.
[35] L. Iocchi, D. Mastrantuono and D. Nardi, “A probabilistic approach to hough localiza-
tion,” in IEEE Int. Conf. Robotics and Automation, 2001.
[36] A. K. Jain and R. C. Dubes, Algorithms for Clustering Data. Prentice Hall, 1988.
[37] P. Jensfelt, J. Folkesson, D. Kragic and H. I. Christensen, “Exploiting distinguishable
image features in robotic mapping and localization,” in The European Robotics Symp.
(EUROS), 2006.
[38] R. E. Kalman, “A new approach to linear filtering and prediction problems,” Trans.
ASME, J. Basic Eng., vol. 82, pp. 35–45, Mar. 1960.
[39] G. A. Kaminka, P. U. Lima and R. Rojas, Eds., RoboCup-2002: Robot Soccer World Cup
VI. Berlin: Springer Verlag, 2003.
[40] M. S. Kim and W. Uther, “Automatic gait optimisation for quadruped robots,” in
Australasian Conf. Robotics and Automation, Brisbane, Dec. 2003.
[41] H. Kitano, Ed. RoboCup-97: Robot Soccer World Cup I. Berlin: Springer Verlag,
1998.
[42] H. Kitano, M. Asada, Y. Kuniyoshi, I. Noda and E. Osawa, “RoboCup: The robot world
cup initiative,” in Proc. First Int. Conf. Autonomous Agents, Marina Del Rey: California,
Feb. 1997, pp. 340–347. doi:10.1145/267658.267738
[43] N. Kohl and P. Stone, “Machine learning for fast quadrupedal locomotion,” in Nineteenth
National Conf. on Artificial Intelligence, July 2004, pp. 611–616.
[44] ——, “Policy gradient reinforcement learning for fast quadrupedal locomotion,” in Proc.
IEEE Int. Conf. Robotics and Automation, May 2004.
[45] C. Kwok and D. Fox, “Map-based multiple model tracking of a moving object,” in Int.
RoboCup Symp., Lisbon, 2004.
[61] T. Roefer, R. Brunn, S. Czarnetzki, M. Dassler, M. Hebbel, M. Juengel, T. Kerkhof,
W. Nistico, T. Oberlies, C. Rohde, M. Spranger and C. Zarges, “Germanteam 2005,” in
RoboCup 2005: Robot Soccer World Cup IX, Lecture Notes in Artificial Intelligence. Springer,
2005.
[62] T. Roefer, “Evolutionary gait-optimization using a fitness function based on proprioception,” in RoboCup 2004: Robot Soccer World Cup VIII. Berlin: Springer Verlag, 2005, pp. 310–322.
[63] T. Rofer, H.-D. Burkhard, U. Duffert, J. Hoffman, D. Gohring, M. Jungel, M. Lotzach,
O. v. Stryk, R. Brunn, M. Kallnik, M. Kunz, S. Petters, M. Risler, M. Stelzer, I. Dahm,
M. Wachter, K. Engel, A. Osterhues, C. Schumann and J. Ziegler, “Germanteam
robocup 2003,” Technical report, 2003.
[64] T. Rofer and M. Jungel, “Vision-based fast and reactive Monte-Carlo Localization,” in
IEEE Int. Conf. Robotics and Automation, Taipei, Taiwan, 2003, pp. 856–861.
[65] C. Rosenberg, M. Hebert and S. Thrun, “Color constancy using kl-divergence,” in IEEE
Int. Conf. Computer Vision, 2001.
[66] C. Sammut, W. Uther and B. Hengst, “rUNSWift 2003 Team Report,” Technical report, School of Computer Science and Engineering, University of New South Wales, 2003.
[67] R. J. Schilling, Fundamentals of Robotics: Analysis and Control. Prentice Hall, 2000.
[68] A. Selinger and R. C. Nelson, “A perceptual grouping hierarchy for appearance-
based 3d object recognition,” Comput. Vis. Image Underst., vol. 76, No. 1, pp. 83–92,
Oct. 1999. doi:10.1006/cviu.1999.0788
[69] J. Shi and J. Malik, “Normalized cuts and image segmentation,” in IEEE Transactions on
Pattern Analysis and Machine Intelligence (PAMI), 2000.
[70] M. Sridharan, G. Kuhlmann and P. Stone, “Practical vision-based Monte Carlo Local-
ization on a legged robot,” in IEEE Int. Conf. Robotics and Automation, April 2005.
[71] M. Sridharan and P. Stone, “Towards on-board color constancy on mobile robots,” in
First Canadian Conf. Computer and Robot Vision, May 2004.
[72] ——, “Autonomous color learning on a mobile robot,” in Proc. Twentieth National Conf.
Artificial Intelligence, July 2005.
[73] ——, “Real-time vision on a mobile robot platform,” in IEEE/RSJ Int. Conf. Intelligent
Robots and Systems, Aug. 2005.
[74] ——, “Towards illumination invariance in the legged league,” in D. Nardi, M. Riedmiller
and C. Sammut, Eds, RoboCup-2004: Robot Soccer World Cup VIII, Lecture Notes in
Artificial Intelligence, vol. 3276, Berlin: Springer Verlag, 2005, pp. 196–208.
[75] ——, “Towards eliminating manual color calibration at RoboCup,” in I. Noda, A.
Jacoff, A. Bredenfeld and Y. Takahashi, Eds., RoboCup-2005: Robot Soccer World Cup IX,
vol. 4020. Berlin: Springer Verlag, 2006, pp. 673–681. doi:10.1007/11780519_68
[88] A. Torralba, K. P. Murphy and W. T. Freeman, “Sharing visual features for multiclass
and multiview object detection,” in IEEE Conf. Computer Vision and Pattern Recognition
(CVPR), Washington DC, 2004.
[89] M. Veloso, S. Lenser, D. Vail, M. Roth, A. Stroupe and S. Chernova, “CMPack-02:
CMU’s legged robot soccer team,” Oct. 2002.
[90] M. Veloso, E. Pagello and H. Kitano, Eds., RoboCup-99: Robot Soccer World Cup III.
Berlin: Springer Verlag, 2000.
[91] R. Zhang and P. Vadakkepat, “An evolutionary algorithm for trajectory based gait generation of biped robot,” in Proc. Int. Conf. Computational Intelligence, Robotics and Autonomous Systems, Singapore, 2003.
Biography
Dr. Peter Stone is an Alfred P. Sloan Research Fellow and Assistant Professor in the
Department of Computer Sciences at the University of Texas at Austin. He received his
Ph.D. in 1998 and his M.S. in 1995 from Carnegie Mellon University, both in Computer
Science. He received his B.S. in Mathematics from the University of Chicago in 1993. From
1999 to 2002 he was a Senior Technical Staff Member in the Artificial Intelligence Principles
Research Department at AT&T Labs - Research.
Prof. Stone’s research interests include planning, machine learning, multiagent systems,
robotics, and e-commerce. Application domains include robot soccer, autonomous bidding
agents, traffic management, and autonomic computing. His doctoral thesis research contributed
a flexible multiagent team structure and multiagent machine learning techniques for teams
operating in real-time noisy environments in the presence of both teammates and adversaries.
He has developed teams of robot soccer agents that have won six robot soccer tournaments
(RoboCup) in both simulation and with real robots. He has also developed agents that have won
four auction trading agent competitions (TAC). Prof. Stone is the author of “Layered Learning
in Multiagent Systems: A Winning Approach to Robotic Soccer” (MIT Press, 2000). In 2003,
he won a CAREER award from the National Science Foundation for his research on learning
agents in dynamic, collaborative, and adversarial multiagent environments. In 2004, he was
named an ONR Young Investigator for his research on machine learning on physical robots.
Most recently, he was awarded the prestigious IJCAI 2007 Computers and Thought award.