
Computers ind. Engng Vol. 19, Nos 1-4, pp. 215-218, 1990
0360-8352/90 $3.00 + 0.00
Printed in Great Britain. All rights reserved
Copyright © 1990 Pergamon Press plc
3-D Object Pose Determination Using Computer Vision

Paul M. Griffin
School of Industrial and Systems Engineering
Georgia Institute of Technology
Atlanta, Georgia 30332-0205

ABSTRACT

This paper is concerned with determining the position and orientation of a three-dimensional object using computer vision. An algorithm is presented which uses an enclosing object and the central moments for an initial estimate. A refined estimation is achieved by a nonlinear least squares technique.

INTRODUCTION

The principal goal of robot vision is to increase the flexibility with which a robot can interact with its environment. For instance, it is desirable to give the robot the ability to recognise objects and to determine the position and orientation (or pose) of the object so that it can determine how the object should be picked up. In this paper, we present a methodology for the determination of the object pose for general 3-D objects using computer vision.

Many researchers have addressed the problem of the determination of object pose. Chang [3] presents a method for 2-D objects using feature masks, and gives a good overview of other 2-D techniques. Other researchers have proposed methods for 3-D objects, but require the object to be polyhedral [5], [8] and [14]. Haralick et al. [12] provide methods for both 2-D and 3-D pose problems in which the object representation is a point pattern, by using a least-squares technique.

We present a methodology which allows pose determination of general-shaped 3-D objects. No requirements are made that the object be polyhedral or a point pattern. The method is also quite efficient since it reduces the number of parameters which must be solved to three in the worst case by using an enclosing object technique. The method has an additional advantage in that no time need be spent in preprocessing of the vision information, such as finding edges or computing other features. It is only necessary that range data be provided.

Three issues are addressed in this paper. The first issue is how 3-D information is obtained about a scene from 2-D projections. The second issue is what is an appropriate object representation to help reduce the complexity of the pose determination problem. The third issue is how to determine the object pose given the 3-D information and the object representation. Each of these issues is addressed in the following sections.

STEREO CORRESPONDENCE

Computer vision provides 2-D information about a scene. For this reason, the 3-D world is underdetermined, since an uncountable number of 3-D scenes can produce the same 2-D image. In order to obtain 3-D information using computer vision, multiple images of the scene must be taken. Shared points between the multiple images provide range data. The problem of finding the shared points within these images is called stereo correspondence.

We assume that multiple views of a scene may be obtained by the robot moving the camera to different positions. The environment for two camera views is shown in Figure 1. A point P with world coordinates $(x_w, y_w, z_w)$ is projected onto the right image plane at coordinate $(x_r, y_r)$ and the left image plane at coordinate $(x_l, y_l)$. If the focal lengths $f_r$ and $f_l$ are known and the rotational and translational offsets between the image planes are known, then the world coordinate may be determined by the inverse projection of the two image coordinates through their corresponding focal points.

Figure 1. Environment for 2 Camera System

This problem is complicated by the fact that all imaging systems are subject to error. Typical sources of error include lens distortion, spatial and intensity quantisation, and noise. For this reason, the inverse projections do not necessarily intersect.

There are many proposed solutions to this problem. The method used in the research presented in this paper is discussed in Griffin [9]. This method formulates the problem as a bipartite matching problem where the objective is to find the maximal matching of minimum weight. Due to the fact that all imaging systems are subject to error, the results of the matching give inaccurate range data. We assume that each point $p_i$ given in the range data will lie with uniform distribution within a sphere of radius r of the actual point $q_i$. It is important to note, however, that the pose determination algorithm is not dependent on how the range information was obtained.
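As a concrete illustration of the inverse projection described above, the sketch below triangulates one matched pair of image points by taking the midpoint of the closest approach of the two back-projected rays, which sidesteps the fact that noisy rays rarely intersect. It is only a minimal sketch under assumed conventions (the left camera frame is taken as the world frame, and R, t denote the rotation and translation from the left to the right camera frame); the function and variable names are illustrative and not taken from the paper.

```python
import numpy as np

def back_project(u, v, f):
    """Unit direction of the ray through image point (u, v) for focal length f,
    expressed in that camera's own coordinate frame."""
    d = np.array([u, v, f], dtype=float)
    return d / np.linalg.norm(d)

def triangulate_midpoint(uv_left, uv_right, f_left, f_right, R, t):
    """Approximate world point for one matched pair of image coordinates.

    Assumed convention: the left camera frame is the world frame, and a point p
    in the left frame maps to R @ p + t in the right frame.  Since noisy rays
    rarely intersect, the midpoint of their closest approach is returned.
    """
    d_l = back_project(*uv_left, f_left)            # ray direction, left/world frame
    d_r = R.T @ back_project(*uv_right, f_right)    # right ray, rotated into world frame
    o_r = -R.T @ t                                  # right camera centre in world frame

    # Choose scalars (s, u) minimising |s*d_l - (o_r + u*d_r)| in the least-squares sense.
    A = np.stack([d_l, -d_r], axis=1)               # 3x2 system
    (s, u), *_ = np.linalg.lstsq(A, o_r, rcond=None)
    return 0.5 * (s * d_l + o_r + u * d_r)          # midpoint of closest approach
```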


OBJECT REPRESENTATION

One of the goals of this research is to develop a pose determination methodology which can be used for objects of general shape. In order to achieve this goal, it is important to use a general object representation. Many different object representations for computer vision have been proposed. A good overview may be found in Chin and Dyer [4]. Pentland [16] argues that most models used in computer vision are too local to describe the gross structure of scenes, and that it is important to have descriptive power at several levels of granularity. To achieve this, Pentland makes use of a parametric representation called superquadrics.

Superquadrics are a collection of smooth parametric shapes developed by Danish designer Piet Hein [7]. There are four main classifications: the superellipsoid, the supertoroid, the superhyperboloid of one sheet and the superhyperboloid of two sheets. The superellipsoid is the most useful shape for computer vision purposes and will be the only classification discussed in this paper. A discussion of the other representations may be found in [1]. The surface of the superellipsoid is defined by the spherical product of the sine-cosine and secant-tangent curves. The 3-D vector form is:

$$
x(\eta,\omega) =
\begin{bmatrix}
a_1 \cos^{\epsilon_1}(\eta)\cos^{\epsilon_2}(\omega) \\
a_2 \cos^{\epsilon_1}(\eta)\sin^{\epsilon_2}(\omega) \\
a_3 \sin^{\epsilon_1}(\eta)
\end{bmatrix},
\qquad -\pi/2 \le \eta \le \pi/2, \quad -\pi \le \omega < \pi.
$$

The variables $a_1$, $a_2$ and $a_3$ are the scale parameters which determine the length along the x, y and z axes respectively. The variables $\epsilon_1$ and $\epsilon_2$ are called the shape parameters. The shape parameters define the roundness of the curve. Figure 2 shows the 2-D version of the superellipsoid curve for varying values of $\epsilon$.

Figure 2. 2-D Superquadrics for Varying ε
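A short sketch of the parametric form above may make the roles of the scale and shape parameters concrete. The signed-power helper is a common superquadric convention (not stated in the paper) used to keep fractional powers of negative sines and cosines real; all function and parameter names here are illustrative.

```python
import numpy as np

def _spow(base, expo):
    """Signed power: sign(base) * |base|**expo, so fractional exponents of
    negative cosines/sines stay real (a common superquadric convention)."""
    return np.sign(base) * np.abs(base) ** expo

def superellipsoid_surface(a1, a2, a3, eps1, eps2, n_eta=40, n_omega=80):
    """Sample points x(eta, omega) on the superellipsoid surface."""
    eta = np.linspace(-np.pi / 2, np.pi / 2, n_eta)
    omega = np.linspace(-np.pi, np.pi, n_omega)
    eta, omega = np.meshgrid(eta, omega, indexing="ij")

    x = a1 * _spow(np.cos(eta), eps1) * _spow(np.cos(omega), eps2)
    y = a2 * _spow(np.cos(eta), eps1) * _spow(np.sin(omega), eps2)
    z = a3 * _spow(np.sin(eta), eps1)
    return np.stack([x, y, z], axis=-1)      # shape (n_eta, n_omega, 3)

# eps1 = eps2 = 1 gives an ellipsoid; values near 0 approach a box.
points = superellipsoid_surface(1.0, 1.0, 2.0, 0.5, 0.5)
```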
Superquadrics have some very useful properties for computer vision. First, superquadrics have a well-defined inside-outside function. Given a point in space, the inside-outside function indicates where the point lies relative to the surface of the superquadric. For the superellipsoid:

$$
F(x,y,z) =
\left[ \left(\frac{x}{a_1}\right)^{2/\epsilon_2} + \left(\frac{y}{a_2}\right)^{2/\epsilon_2} \right]^{\epsilon_2/\epsilon_1}
+ \left(\frac{z}{a_3}\right)^{2/\epsilon_1}.
$$

If $F(x,y,z) = 1$ then $(x,y,z)$ is on the superquadric surface, if $F(x,y,z) < 1$ then $(x,y,z)$ is inside the superquadric surface, and if $F(x,y,z) > 1$ then $(x,y,z)$ is outside the superquadric surface.
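The inside-outside function translates directly into code. The sketch below follows the formula above, with absolute values added (a standard convention, not shown in the paper) so that fractional exponents of negative coordinates stay real; the point is assumed to be expressed in the superquadric's own coordinate frame, and the names are illustrative.

```python
import numpy as np

def inside_outside(p, a1, a2, a3, eps1, eps2):
    """Evaluate the superellipsoid inside-outside function F at point(s) p.

    F < 1: inside the surface, F == 1: on the surface, F > 1: outside.
    p is an (..., 3) array given in the superquadric's own frame.
    """
    x, y, z = np.moveaxis(np.asarray(p, dtype=float), -1, 0)
    f_xy = np.abs(x / a1) ** (2.0 / eps2) + np.abs(y / a2) ** (2.0 / eps2)
    return f_xy ** (eps2 / eps1) + np.abs(z / a3) ** (2.0 / eps1)

# A unit sphere (a1 = a2 = a3 = 1, eps1 = eps2 = 1): the origin is inside, (2, 0, 0) outside.
print(inside_outside(np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]), 1, 1, 1, 1, 1))
```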
A second useful property of superquadrics is that the normal vector at any point on the surface may be expressed in closed form. For the superellipsoid, the normal vector is:

$$
n(\eta,\omega) =
\begin{bmatrix}
\frac{1}{a_1}\cos^{2-\epsilon_1}(\eta)\cos^{2-\epsilon_2}(\omega) \\
\frac{1}{a_2}\cos^{2-\epsilon_1}(\eta)\sin^{2-\epsilon_2}(\omega) \\
\frac{1}{a_3}\sin^{2-\epsilon_1}(\eta)
\end{bmatrix}.
$$

Barr [1,2] has developed a set of angle-preserving transformations for superquadrics. The transformations include quadratic bending, linear tapering and cavity deformation. These transformations increase the modeling capability of superquadrics. For details, the reader is referred to Barr [1,2]. Figure 3 shows example superquadrics with these transformations applied.

While superquadrics are quite descriptive in themselves, they are not general enough to describe 3-D objects with arbitrary shapes. Pentland [16] argues that superquadrics can be put together like lumps of clay to define complex objects. We use a relational graph in order to achieve this.

A relational graph G is a four-tuple (V, E, Lv, Le), where V is a non-empty vertex set, E is an edge set, and Lv and Le are the vertex and edge labels respectively. Each element of V is a superquadric primitive. If there is a relationship between superquadrics then this is expressed in E. The vertex label, Lv, is a record which specifies the type of superquadric, the shape and scale parameters, and the absolute positional parameters. The edge label, Le, describes the relationship between superquadric primitives in terms of Boolean combinations (e.g. or, and, difference operations).
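One possible rendering of the four-tuple G = (V, E, Lv, Le) as a data structure is sketched below. The class and field names are illustrative, not taken from the paper, and the example object at the end is purely hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

@dataclass
class SuperquadricVertex:
    """Vertex label Lv: one superquadric primitive."""
    kind: str                                 # e.g. "superellipsoid"
    scale: Tuple[float, float, float]         # a1, a2, a3
    shape: Tuple[float, float]                # eps1, eps2
    position: Tuple[float, float, float]      # absolute positional parameters

@dataclass
class RelationalGraph:
    """G = (V, E, Lv, Le): vertices are primitives; edge labels Le are Boolean
    combinations such as 'or', 'and', or 'difference'."""
    vertices: Dict[str, SuperquadricVertex] = field(default_factory=dict)
    edges: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def add_primitive(self, name: str, v: SuperquadricVertex) -> None:
        self.vertices[name] = v

    def relate(self, a: str, b: str, op: str) -> None:
        self.edges[(a, b)] = op

# Illustrative example: a mug modelled as a body combined with a handle.
g = RelationalGraph()
g.add_primitive("body", SuperquadricVertex("superellipsoid", (3, 3, 5), (1, 1), (0, 0, 0)))
g.add_primitive("handle", SuperquadricVertex("supertoroid", (1, 1, 1), (1, 1), (3.5, 0, 0)))
g.relate("body", "handle", "or")
```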

OBJECT POSE DETERMINATION

In order to determine the object pose, seven parameters must be determined. There are three rotational parameters $(\theta_x, \theta_y, \theta_z)$, three translational parameters $(T_x, T_y, T_z)$, and one scale parameter $S$. Given range information for an observed object $O'$ and database information for the reference object $O$, the object pose problem is stated as: find the transformation $T_r$ for which

$$
O = T_r \cdot O'
$$



Figure 3. 3-D Superquadric Examples

where:

$$
T_r =
\begin{bmatrix}
R & R\,T^{t} \\
0\;\;0\;\;0 & 1
\end{bmatrix},
$$

$$
R =
\begin{bmatrix}
c\theta_z c\theta_y & c\theta_z s\theta_y s\theta_x - s\theta_z c\theta_x & c\theta_z s\theta_y c\theta_x + s\theta_z s\theta_x \\
s\theta_z c\theta_y & s\theta_z s\theta_y s\theta_x + c\theta_z c\theta_x & s\theta_z s\theta_y c\theta_x - c\theta_z s\theta_x \\
-s\theta_y & c\theta_y s\theta_x & c\theta_y c\theta_x
\end{bmatrix},
$$

$$
T = \begin{bmatrix} T_x & T_y & T_z \end{bmatrix},
$$

where $c\alpha = \cos(\alpha)$ and $s\alpha = \sin(\alpha)$.
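For reference, a minimal sketch of how R and Tr might be assembled from the pose parameters is given below. The paper's matrix shows only R and T; folding the uniform scale S into the rotation block is one possible convention assumed here, and the function names are illustrative.

```python
import numpy as np

def rotation_zyx(theta_x, theta_y, theta_z):
    """Rotation matrix R with the cos/sin layout given above (Z, then Y, then X)."""
    cx, sx = np.cos(theta_x), np.sin(theta_x)
    cy, sy = np.cos(theta_y), np.sin(theta_y)
    cz, sz = np.cos(theta_z), np.sin(theta_z)
    return np.array([
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy,     cy * sx,                cy * cx],
    ])

def pose_transform(theta, T, S=1.0):
    """4x4 homogeneous transform with rotation block S*R and translation R @ T."""
    R = rotation_zyx(*theta)
    Tr = np.eye(4)
    Tr[:3, :3] = S * R
    Tr[:3, 3] = R @ np.asarray(T, dtype=float)
    return Tr

# Applying Tr to homogeneous points O' of shape (n, 4) gives O = (Tr @ O'.T).T
```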
One way in which the rotational and translational offsets can be determined is by a nonlinear least squares estimation technique which utilizes the inside-outside function of the superquadric. Messimer [15] presents a two-stage method using nonlinear least squares estimation for 2-D parts. This method can easily be extended to 3-D parts. However, for the 3-D case, there are six parameters which must be solved for instead of one. To overcome this problem, we make use of an enclosing object.

In order to define the translational offsets, some reference point for the object must be defined. We define this reference point as the center of the minimum enclosing sphere which contains the object. The minimum enclosing sphere is formulated as a minimax problem. For a set of points P, where each $p_i \in P$ is defined by position vector $\langle x_i, y_i, z_i \rangle$, we want to minimize the function:

$$
f(a,b,c) = \max_i \left[ (x_i - a)^2 + (y_i - b)^2 + (z_i - c)^2 \right]^{1/2}.
$$

For the observed object, the point set P is given by the range data. For the database object, however, this point set must be specified. These can either be specified explicitly by the user when the object is being entered in the database, or can be determined by defining a step size for varying values of $\eta$ and $\omega$.
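The minimax objective f(a, b, c) is easy to state in code. The paper computes the enclosing sphere exactly with a 3-D extension of the Elzinga-Hearn algorithm (described in the next paragraph and in [10]); as a rough stand-in only, the sketch below uses Ritter's simple two-pass heuristic, which typically overestimates the minimum radius slightly. Names are illustrative.

```python
import numpy as np

def minimax_radius(center, points):
    """The objective f(a, b, c): the largest distance from `center` to any point."""
    return np.max(np.linalg.norm(points - center, axis=1))

def enclosing_sphere_ritter(points):
    """Approximate minimum enclosing sphere of a 3-D point set (Ritter's heuristic)."""
    points = np.asarray(points, dtype=float)
    far1 = points[np.argmax(np.linalg.norm(points - points[0], axis=1))]
    far2 = points[np.argmax(np.linalg.norm(points - far1, axis=1))]
    center = 0.5 * (far1 + far2)
    radius = 0.5 * np.linalg.norm(far1 - far2)

    # Second pass: grow the sphere just enough to cover any point still outside it.
    for x in points:
        d = np.linalg.norm(x - center)
        if d > radius:
            radius = 0.5 * (radius + d)
            center = center + (1.0 - radius / d) * (x - center)
    return center, radius
```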
In order to determine the minimum enclosing spheres, we used an extension of the Elzinga-Hearn algorithm [6]. The Elzinga-Hearn algorithm finds the minimum enclosing circle for the 2-D problem by choosing points which define increasingly larger circles. The algorithm has a worst-case complexity of O(n^2), where n is the number of points. We have modified the Elzinga-Hearn algorithm to work in 3-D. The modified version has the same worst-case complexity. Details may be found in [10].

In order to determine the object pose, the following steps are taken:

1. Determine the minimum enclosing spheres for both the database and observed objects.

2. Align the spheres at their centers.

3. Make an initial assignment of $\theta_x$, $\theta_y$ and $\theta_z$ by using the matrix of central moments [13] (see the sketch after this list).

4. Use a nonlinear least squares estimation technique to compute the final estimates of the rotational offsets.
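Step 3 relies on the matrix of central moments [13]. The sketch below shows one common way such an initial estimate can be formed: aligning the principal axes (eigenvectors of the second-order central-moment matrices) of the observed and database point sets. Eigenvector sign and ordering ambiguities are ignored here, and the details of the paper's own moment-based assignment are in [10] and [13], so treat this only as an illustrative stand-in.

```python
import numpy as np

def initial_rotation_from_moments(points_obs, points_db):
    """Initial estimate of the rotational offset from second-order central moments.

    The eigenvectors of each point set's central-moment (covariance) matrix give a
    body-fixed frame; aligning the two frames yields a candidate rotation.  Sign and
    ordering ambiguities of the eigenvectors are not resolved in this sketch.
    """
    def principal_axes(points):
        centred = points - points.mean(axis=0)
        moments = centred.T @ centred / len(points)   # 3x3 central moment matrix
        _, vecs = np.linalg.eigh(moments)             # columns are the principal axes
        return vecs

    V_obs = principal_axes(np.asarray(points_obs, dtype=float))
    V_db = principal_axes(np.asarray(points_db, dtype=float))
    return V_db @ V_obs.T    # rotation taking the observed axes onto the database axes
```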
The nonlinear least squares estimation technique mentioned in step 4 is an extended version of the 2-D method presented in Messimer [15]. Details of the approach may be found in [10].
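The exact two-stage estimation extended from Messimer [15] is detailed in [10]; as a hedged sketch of the general idea only, the code below refines the rotational offsets by driving the inside-outside residuals of the (already centre-aligned) range points toward the model surface with scipy's least-squares solver. It reuses rotation_zyx and inside_outside from the earlier sketches, assumes the database object is a single superellipsoid packed as model = (a1, a2, a3, eps1, eps2), and the function names and subsampling parameter are illustrative.

```python
import numpy as np
from scipy.optimize import least_squares

def rotation_residuals(theta, points, model):
    """Residuals F(R(theta)^T p_i) - 1 for range points already translated so the
    enclosing-sphere centres coincide (steps 1-2); zero when every point lies on
    the model surface."""
    a1, a2, a3, eps1, eps2 = model
    R = rotation_zyx(*theta)          # from the earlier sketch
    q = points @ R                    # rotate the points into the model frame
    return inside_outside(q, a1, a2, a3, eps1, eps2) - 1.0   # earlier sketch

def refine_rotation(theta0, points, model, n_subsample=200):
    """Step 4: refine the moment-based initial estimate theta0 by nonlinear least
    squares, optionally on a random subsample of the range points."""
    points = np.asarray(points, dtype=float)
    if len(points) > n_subsample:
        idx = np.random.choice(len(points), n_subsample, replace=False)
        points = points[idx]
    result = least_squares(rotation_residuals, theta0, args=(points, model))
    return result.x
```

Subsampling mirrors the speed-up suggested below: start with few points and add more as the solution converges.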
A couple of points should be mentioned about the approach. First, the slowest part of the approach occurs at step 4. It is possible to speed up this step, however, by subsampling the points provided in the range data. In fact, when the iterations are first started, a small number of points can be used, and as the solution starts to converge, more points can be added for refinement. Another possibility is that for most applications, the object will be in one of only a few stable positions. This is due to the fact that for most applications, the object is sitting on a flat surface. If the object is in a stable position, then only one rotational parameter need be solved for. If the set of stable positions is fairly small, then the process can be repeated for each element in the set. Also, the initial rotational offset can determine what the stable position is. Having to solve for only one rotational parameter greatly speeds up the process.

It is possible that the minimum enclosing sphere of the database object has a radius R which differs from the radius $R_o$ of the observed object. Due to the inaccuracies in the range data, the radius of the minimum enclosing sphere of the observed object must fall in the bound:

$$
R_o \in [R - r,\; R + r],
$$

where r is the radius defined in the assumption made about the bound on each range point. If this bound is violated then there is one of two possibilities. Either not enough range data is available for pose determination, in which case more data must be collected, or the observed object differs from the database object.
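This consistency check is simple enough to state directly; the function and argument names below are illustrative only.

```python
def radius_consistent(R_db, R_obs, r):
    """Return True when the observed enclosing-sphere radius R_obs falls inside
    [R_db - r, R_db + r]; otherwise either more range data is needed or the
    observed object does not match the database object."""
    return (R_db - r) <= R_obs <= (R_db + r)
```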
One final point needs to be discussed. As mentioned earlier, Pentland [16] argues that the object should be represented at different levels of granularity. Both Pentland [16] and Solina and Bajcsy [17] have developed a technique to determine the shape and scale parameters for a superquadric. In this case the object is not described by superquadric primitives, but by a single superquadric. The method presented will still work in this case. In fact, any level of granularity is acceptable as long as the object is represented by some set of superquadrics.

CONCLUSIONS

This paper has presented a method to determine the object pose of 3-D objects. The method requires no preprocessing steps such as the determination of edges or other higher moments. The method is also applicable to 3-D objects of general shapes due to the superquadric-based representation used. Finally, by the use of an enclosing object, the method reduces the number of parameters which must be solved for in the nonlinear least squares estimation.

ACKNOWLEDGEMENT

The author gratefully acknowledges Chuan Ju Su for providing Figure 3.

REFERENCES

1. A.H. Barr, Superquadrics and angle-preserving transformations, IEEE Computer Graphics and Applications, 1, 11-23 (1981).

2. A.H. Barr, Global and local deformations of solid primitives, Computer Graphics, 18, 21-30 (1984).

3. C.A. Chang, J. Goldman and J.M. Pan, Part positioning with feature masks for computer vision systems, IIE Transactions, 19, 182-189 (1987).

4. R.T. Chin and C.R. Dyer, Model-based recognition in robot vision, Computing Surveys, 18, 68-108 (1986).

5. M. Dhome, M. Richetin, J.T. Lapreste and G. Rives, Determination of the attitude of 3-D objects from a single perspective view, IEEE Transactions on Pattern Analysis and Machine Intelligence, 11, 1265-1278 (1989).

6. J. Elzinga and D.W. Hearn, Geometric solutions for some minimax location problems, Transportation Science, 6, 379-394 (1972).

7. M. Gardner, The superellipse: a curve that lies between the ellipse and the rectangle, Scientific American, 213, 224-234 (1965).

8. S.J. Gordon and W.P. Seering, Real-time part position sensing, IEEE Transactions on Pattern Analysis and Machine Intelligence, 10, 374-388 (1988).

9. P.M. Griffin, Correspondence of 2-D projections by bipartite matching, Pattern Recognition Letters, 9, 361-368 (1989).

10. P.M. Griffin, Determination of object pose using range data, Working Paper, School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA (1990).

11. W.E.L. Grimson and T. Lozano-Perez, Model-based recognition and localization from sparse range or tactile data, International Journal of Robotics Research, 3, 3-35 (1984).

12. R.M. Haralick, H. Joo, C.N. Lee, X. Zhuang, V.G. Vaidya and M.B. Kim, Pose estimation from corresponding point data, IEEE Transactions on Systems, Man, and Cybernetics, 19, 1426-1446 (1989).

13. B.K.P. Horn, Robot Vision, MIT Press, Cambridge, MA (1986).

14. R. Krishnapuram and D. Casasent, Determination of three-dimensional object location and orientation from range images, IEEE Transactions on Pattern Analysis and Machine Intelligence, 11, 1158-1167 (1989).

15. S.L. Messimer, Superquadric-based part identification and tolerancing, in Proceedings of the 12th Computers and Industrial Engineering Conference, Orlando, FL (1990).

16. A.P. Pentland, Perceptual organization and the representation of natural form, Artificial Intelligence, 28, 293-331 (1986).

17. F. Solina and R. Bajcsy, Recovery of parametric models from range images: the case for superquadrics with global deformations, IEEE Transactions on Pattern Analysis and Machine Intelligence, 12, 131-147 (1990).
