Priyanshu Agarwal MAE

Executive Summary
Priyanshu Agarwal
Professor: Dr. K. English

Problem Summary: In this work, we propose a method to estimate the pose of human
lower limbs from a single image without any user assistance. Our method is based on the
observation that if an image can be generated from a 3D model of a human, a metric can be
established by comparing this model-generated image with the original image. We use image
subtraction to evaluate this metric, which measures the accuracy of the estimated pose, and
minimize it using optimization.
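
As a rough illustration of the image-subtraction metric, the sketch below counts the pixels
at which a model-generated silhouette and an image silhouette disagree. It assumes both
silhouettes are already available as equally sized binary (0/1) masks stored as nested lists.
The function name silhouette_mismatch, the helper rectangle_mask, and the toy 6x6 masks are
hypothetical and are not taken from the original implementation.

```python
def silhouette_mismatch(model_mask, image_mask):
    """Count pixels where two equally sized binary (0/1) silhouettes disagree."""
    if len(model_mask) != len(image_mask):
        raise ValueError("silhouettes must have the same number of rows")
    mismatch = 0
    for model_row, image_row in zip(model_mask, image_mask):
        if len(model_row) != len(image_row):
            raise ValueError("silhouettes must have the same number of columns")
        # For 0/1 pixels, |m - i| is 1 exactly where the two masks differ,
        # so the sum is the area of the subtracted image.
        mismatch += sum(abs(m - i) for m, i in zip(model_row, image_row))
    return mismatch


def rectangle_mask(height, width, rows, cols):
    """Hypothetical helper: a 0/1 mask containing one filled rectangle."""
    return [[1 if r in rows and c in cols else 0 for c in range(width)]
            for r in range(height)]


# Toy data (assumption): a 4x2 "limb" and a model copy shifted one column right.
image_mask = rectangle_mask(6, 6, range(1, 5), range(2, 4))
model_mask = rectangle_mask(6, 6, range(1, 5), range(3, 5))

mismatch = silhouette_mismatch(model_mask, image_mask)
print(mismatch)
```

In this toy case the two masks disagree on 8 pixels, while identical masks give 0, which is
the value an optimizer would drive the metric toward.
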
Solution Approach: The system takes an image from the data set and extracts the
silhouette corresponding to the lower limbs of the human body. The area of the silhouette is
then evaluated and the model z coordinate is changed such that the difference in the area of
the model-generated silhouette and the actual human silhouette is minimized. The centroids
of the two silhouettes are then located and optimized to lie as close together as possible.
Finally, the optimization of the limb pose is carried out such that the absolute area of the
two subtracted
silhouette images is minimized. Once a solution is obtained the pose of the lower limbs is
swapped and the problem is solved once again to obtain a solution with the two legs now
interchanged in pose. Both poses are then saved as the probable poses. The overall
system consists of three optimization subroutines. We run the optimization routines
iteratively and minimize the three objective functions sequentially. We choose to optimize
the functions sequentially because the nature of the problem is such that the optimization of
individual functions does not impose contradictory requirements on the design variables. We
use the Method of Multipliers to solve the N-dimensional (ND) constrained optimization
problem. To solve the ND unconstrained optimization subproblem, we use Powell’s conjugate
direction method, as it does not require any derivative information about the objective
function. For the 1D optimization subproblem, we employ golden section search with Swann’s
bounding.
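
The 1D subproblem described above can be sketched as follows: Swann's bounding brackets a
minimum by taking steps that double in size, and golden section search then shrinks that
bracket. This is a minimal sketch under stated assumptions, not the original subroutines.
The function names, the initial step, the tolerance, and the toy quadratic objective
(standing in for a line-search objective) are all illustrative.

```python
import math


def swann_bracket(f, x0, step=0.1, max_iter=60):
    """Bracket a minimum of a unimodal f, starting at x0 with a positive step."""
    f_left, f_mid, f_right = f(x0 - step), f(x0), f(x0 + step)
    if f_left >= f_mid <= f_right:
        return x0 - step, x0 + step      # minimum already bracketed
    if f_left <= f_mid >= f_right:
        raise ValueError("function is not unimodal near x0")
    # Move in the descending direction, doubling the step each time.
    direction = step if f_right < f_mid else -step
    x_prev, x_curr = x0, x0 + direction
    f_curr = f_right if direction > 0 else f_left
    for k in range(1, max_iter):
        x_next = x_curr + (2 ** k) * direction
        f_next = f(x_next)
        if f_next >= f_curr:             # function started to rise again
            return min(x_prev, x_next), max(x_prev, x_next)
        x_prev, x_curr, f_curr = x_curr, x_next, f_next
    raise RuntimeError("no bracket found within max_iter steps")


def golden_section(f, a, b, tol=1e-6):
    """Shrink the bracket [a, b] around the minimum of a unimodal f."""
    inv_phi = (math.sqrt(5) - 1) / 2     # about 0.618
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc < fd:                      # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                            # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2


def toy_objective(x):
    """Toy stand-in (assumption) with its minimum at x = 2."""
    return (x - 2.0) ** 2 + 1.0


lo, hi = swann_bracket(toy_objective, x0=0.0)
x_min = golden_section(toy_objective, lo, hi)
print(lo, hi, x_min)
```

Neither routine uses derivatives, which matches the reason given for choosing Powell's
method: only function evaluations are needed.
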
Results: The presented optimization-based framework is an effective way to estimate
human pose from a single image. The technique provides the most probable poses, i.e., it
capitalizes on finding the local minima of a multi-modal function. Most existing techniques
require some user input, whereas the approach presented in this work eliminates all such
requirements. However, the computation time to arrive at a solution was observed to be large
for certain images that were slow to converge (a few hours on a standard PC with 3 GB of
RAM). We believe a more efficient implementation of the subroutines in C/C++ can reduce the
solution time.
Future Work: In the future, a technique that uses the edges in the original image as a
metric could be used to choose one pose out of the two probable ones. This could also be
achieved by using inference or move limits from the previous pose when a sequence of images
is being processed. The technique can also be extended to images in which the human limbs
are oriented in 3D, and the problem can be extended to the full human body. In addition, an
optimization framework can be set up to optimize the geometry of the model simultaneously
when a video, i.e., a sequence of images, is being processed. This adaptability in the model
shape will result in more plausible pose estimation.