Jacques Blanc-Talon · Rudi Penne
Wilfried Philips · Dan Popescu
Paul Scheunders (Eds.)
Advanced Concepts
for Intelligent
Vision Systems

LNCS 10617

18th International Conference, ACIVS 2017
Antwerp, Belgium, September 18–21, 2017
Proceedings
Lecture Notes in Computer Science 10617
Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board
David Hutchison
Lancaster University, Lancaster, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Friedemann Mattern
ETH Zurich, Zurich, Switzerland
John C. Mitchell
Stanford University, Stanford, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
TU Dortmund University, Dortmund, Germany
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max Planck Institute for Informatics, Saarbrücken, Germany
More information about this series at http://www.springer.com/series/7412
Editors

Jacques Blanc-Talon
DGA, Paris, France

Rudi Penne
University of Antwerp, Antwerp, Belgium

Wilfried Philips
Ghent University - imec, Ghent, Belgium

Dan Popescu
CSIRO Data 61, Canberra, ACT, Australia

Paul Scheunders
University of Antwerp, Wilrijk, Belgium
LNCS Sublibrary: SL6 – Image Processing, Computer Vision, Pattern Recognition, and Graphics
These proceedings gather the selected papers of the Advanced Concepts for Intelligent
Vision Systems (ACIVS) conference, which was held in Antwerp, Belgium, during
September 18–21, 2017.
This event was the 18th ACIVS. Since the first event in Germany in 1999, ACIVS
has grown into a large, independent scientific conference. However, its
distinctive founding governance rules have been maintained:
– To update the conference scope on a yearly basis. While keeping a technical
backbone (the classic low-level image processing techniques), we have introduced
topics of interest such as – chronologically – image and video compression, 3D,
security and forensics, and evaluation methodologies, in order to fit the conference
scope to our scientific community’s needs. In addition, speakers usually give invited
talks on hot issues.
– To remain a single-track conference in order to promote scientific exchanges among
the audience.
– To grant oral presentations a duration of 25 minutes and published papers a length
of 12 pages, which is significantly different from most other conferences.
The second and third items entail a complex management of the conference; in
particular, the number of time slots is rather small. Although the selection between the
two presentation formats is primarily determined by the need to compose a well-
balanced program, papers presented during plenary and poster sessions enjoy the same
importance and publication format.
The first item is supported by the reputation of ACIVS, which has been growing
over the years: official Springer records show a cumulative total of more than
550,000 downloads as of August 1, 2017 (for ACIVS 2005–2016 only).
The regular sessions were complemented by invited talks from Professor
Antonis Argyros (University of Crete), Professor Daniel Cremers (Technische
Universität München), and Professor Sanja Fidler (University of Toronto). We would
like to thank all of them for enhancing the technical program with their presentations.
ACIVS attracted submissions from many different countries, mostly from Europe,
but also from the rest of the world: Algeria, Australia, Austria, Brazil, Belgium,
Canada, China, Cyprus, Czech Republic, Denmark, Ecuador, France, Finland,
Germany, Hungary, India, Israel, Italy, Japan, Korea, Mexico, The Netherlands,
Poland, Romania, Russia, Switzerland, Taiwan, Tunisia, Turkey, Ukraine, the
United Arab Emirates, the UK, and the USA.
From 134 submissions, 38 were selected for oral presentation and 25 as posters. The
paper submission and review procedure was carried out electronically and a minimum
of three reviewers were assigned to each paper. A large and energetic Program
Committee (67 people), helped by additional reviewers (86 people in total), as listed on
the following pages, completed the long and demanding reviewing process. We would
like to thank all of them for their timely and high-quality reviews, achieved in quite a
short time and during the summer holidays.
Also, we would like to thank our sponsors, Antwerp University, Commonwealth
Scientific and Industrial Research Organization (CSIRO), and Ghent University, for
their valuable support.
Finally, we would like to thank all the participants who trusted in our ability to
organize this conference for the 18th time. We hope they found it a stimulating
scientific event and that they enjoyed the atmosphere of the various ACIVS social
events in the city of Antwerp.
As mentioned, a conference like ACIVS would not be feasible without the concerted
effort of many people and the support of various institutions. We are indebted to the
local organizers, Rudi Penne and Paul Scheunders, for having smoothed out all the
rough practical details of the event venue, and we hope to work with them again in
the near future.
Steering Committee
Jacques Blanc-Talon DGA, France
Rudi Penne University of Antwerp, Belgium
Wilfried Philips Ghent University - imec, Belgium
Dan Popescu CSIRO Data 61, Australia
Paul Scheunders University of Antwerp, Belgium
Organizing Committee
Rudi Penne University of Antwerp, Belgium
Paul Scheunders University of Antwerp, Belgium
Program Committee
Hojjat Adeli Ohio State University, USA
Syed Afaq Shah The University of Western Australia, Australia
Hamid Aghajan Ghent University - imec, Belgium
Fabio Bellavia Università degli Studi di Palermo, Italy
Jenny Benois-Pineau University of Bordeaux, France
Jacques Blanc-Talon DGA, France
Philippe Bolon University of Savoie, France
Egor Bondarev Technische Universiteit Eindhoven, The Netherlands
Adrian Bors University of York, UK
Salah Bourennane Ecole Centrale de Marseille, France
Catarina Brites Instituto Superior Técnico, Portugal
Vittoria Bruni University of Rome La Sapienza, Italy
Dumitru Burdescu University of Craiova, Romania
Tiago J. Carvalho Instituto Federal de São Paulo - Campinas, Brazil
Giuseppe Cattaneo University of Salerno, Italy
Jocelyn Chanussot Université de Grenoble Alpes, France
Gianluigi Ciocca University of Milano Bicocca, Italy
Eric Debreuve CNRS, France
Stéphane Derrode Ecole Centrale de Lyon, France
Véronique Eglin INSA, France
Jérôme Gilles San Diego State University, USA
Georgy Gimel’farb The University of Auckland, New Zealand
Daniele Giusto University of Cagliari, Italy
Additional Reviewers
Wim Abbeloos KU Leuven, Belgium
Syed Afaq Shah The University of Western Australia, Australia
Hamid Aghajan Ghent University - imec, Belgium
Gianni Allebosch Ghent University - imec, Belgium
Yiannis Andreopoulos University College London, UK
Danilo Babin Ghent University - imec, Belgium
Fabio Bellavia Università degli Studi di Palermo, Italy
Jenny Benois-Pineau University of Bordeaux, France
Jacques Blanc-Talon DGA, France
Nyan Bo Bo Ghent University - imec, Belgium
Boris Bogaerts University of Antwerp, Belgium
Philippe Bolon University of Savoie, France
Egor Bondarev Technische Universiteit Eindhoven, The Netherlands
Adrian Bors University of York, UK
Salah Bourennane Ecole Centrale de Marseille, France
Catarina Brites Instituto Superior Técnico, Portugal
Vittoria Bruni University of Rome La Sapienza, Italy
Dumitru Burdescu University of Craiova, Romania
Tiago J. Carvalho Instituto Federal de São Paulo - Campinas, Brazil
Giuseppe Cattaneo University of Salerno, Italy
Jocelyn Chanussot Université de Grenoble Alpes, France
Gianluigi Ciocca University of Milano Bicocca, Italy
Matthew Dailey Asian Institute of Technology, Thailand
Eric Debreuve CNRS, France
Nikos Deligiannis VUB, Belgium
Stéphane Derrode Ecole Centrale de Lyon, France
Simon Donné Ghent University - imec, Belgium
Jérôme Gilles San Diego State University, USA
Georgy Gimel’farb The University of Auckland, New Zealand
Daniele Giusto University of Cagliari, Italy
Bart Goossens Ghent University - imec, Belgium
Junzhi Guan Ghent University - imec, Belgium
David Helbert University of Poitiers, France
Michael Hild Osaka Electro-Communication University, Japan
Mark Holden Kyoto University, Japan
Kazuhiro Hotta Meijo University, Japan
Dimitris Iakovidis University of Thessaly, Greece
Syed Islam Edith Cowan University, Australia
Yuji Iwahori Chubu University, Japan
Arto Kaarna Lappeenranta University of Technology, Finland
Ali Khenchaf ENSTA Bretagne, France
Marek Kraft Poznan University of Technology, Poland
Wenzhi Liao Ghent University, Belgium
Céline Loscos Université de Reims Champagne-Ardenne, France
Human-Computer Interaction
Statistical Radial Binary Patterns (SRBP) for Bark Texture Identification . . . . 101
Safia Boudra, Itheri Yahiaoui, and Ali Behloul
Extracting Relevant Features from Videos for a Robust Smoke Detection. . . . 406
Olfa Besbes and Amel Benazza-Benyahia
Image Processing
Vision Based Lidar Segmentation for Scan Matching and Camera Fusion. . . . . 627
Gabriel Burtin, Patrick Bonnin, and Florent Malartre
3D Shape from SEM Image Using Improved Fast Marching Method. . . . . . . 735
Lei Huang, Yuji Iwahori, Aili Wang, and M.K. Bhuyan
1 Introduction
Markerless human pose estimation is essential in a large range of applications
including human-computer interaction, video surveillance, video games, virtual
reality, gait analysis, rehabilitation, and intelligent houses. The vast majority
of markerless human pose estimation systems are camera-based. The preference
goes towards depth cameras rather than color cameras. Indeed, the latter are
sensitive to illumination conditions and textures, and we need more than one
color camera to avoid the problem of scale ambiguity. Since the release of the
Microsoft Kinect camera and the associated pose estimation method [21], depth
cameras have reached an affordable price and, at the same time, pose estimation
has reached a new level of accuracy and robustness using only one camera.
However, current methods require a huge set of learning samples to train
the pose estimation model. The reason is that pose estimation from a single
depth map in an unconstrained environment (i.e. where the person can take
any orientation with respect to the camera) is very complex. Indeed, a pose
estimation method has to implicitly determine the orientation of the person
while predicting the 3D location of the body joints. In previous works [14,15,21],
© Springer International Publishing AG 2017
J. Blanc-Talon et al. (Eds.): ACIVS 2017, LNCS 10617, pp. 3–14, 2017.
https://doi.org/10.1007/978-3-319-70353-4_1
a single machine learning model was learned to handle this task all at once, which
explains the need for an enormous amount of learning data to obtain a model
invariant to the orientation. For instance, in [21], Shotton et al. used 2 billion
data samples to learn their model. Despite this, their results showed that there
was still a significant potential for accuracy improvement but they were limited
by the size of their model. Furthermore, the difference between a depth image of a
person seen from the back and one seen facing the camera can be very small, and
the simple features used in [21] to describe the surroundings of a pixel seem
unlikely to be efficient at disambiguating the orientation. Some global features
would clearly be more appropriate for the orientation estimation task.
Following this observation, the current paper extends a previous work by
Azrour et al. [2], that introduced the idea of using an orientation estimate to
improve the accuracy of the pose estimation and validated it on synthetic data.
Here, we go further by evaluating, on the public dataset UBC3V, the complete
pipeline composed of an orientation estimation method followed by a multiple
model pose estimation method, with each model designed for a different range
of orientations. Splitting the process into two distinct steps allows us to use adapted
features and methods for each task, and our results show that it requires
significantly fewer samples to reach the same or better pose estimation accuracy.
2 Related Work
Markerless human pose estimation systems estimate the pose of a human subject
without the need for any device or marker attached to the body. At the expense of
some loss of accuracy compared to marker-based systems, markerless systems are
simpler and non-invasive, which is mandatory, for instance, in sports and medical
applications where the movement must be natural and not altered by some
markers. Indeed, there is a risk that people could be disturbed and distracted
by the markers they wear, leading to a different motion.
Markerless pose estimation systems can be classified in different categories
depending on the type (color or depth), the number of cameras used, and the
process used to recover the pose. The pose estimation process can be based on a
body model tracking method (generative method), on a more direct prediction of
the pose based on the input image (discriminative method), or on a combination
of the two (hybrid method).
The most challenging, in terms of the accuracy of the estimated pose, is to use
a unique color camera. Previous works addressing this task include poselets [4]
or mixture-of-parts model [26]. More recently, convolutional networks have also
been used to perform this task [23,24].
Thanks to a set of calibrated color cameras, we can avoid the problems of
scale ambiguity inherent to the use of a single color camera. A great contribution
was made by Corazza et al. [6,7]. In [6], they built upon the work of Anguelov
et al. [1] to design a subject-specific human body model. This model was used
to track the reconstructed 3D volume of the subject (or visual hull [16]) using
an articulated iterative closest point (ICP) algorithm [7]. Thanks to their body
model that can fit the body shape of any subject, their method is able to reach
a very high accuracy. With their work, the problem of markerless human pose
estimation using multiple color cameras is considered as mostly solved by some
experts of the domain [22].
Using depth instead of color cameras offers many advantages for the pose
estimation. Indeed, the use of depth images solves the scale ambiguity of the RGB
domain, and depth images are much more invariant to lighting conditions and
texture. Therefore, it becomes possible to create robust and sufficiently accurate
systems using only one depth camera. A major contribution was proposed by
Shotton, Girshick et al. [13,20,21] where they accurately predict in real-time
the 3D positions of body joints from a single depth image, without using any
temporal information. To achieve this, they used a random forest model learned
from a large, realistic, and highly varied synthetic set of training images.
Body model tracking methods were also used with a single depth camera
[9,10]. However, these methods are less robust (they are sensitive to local min-
ima) and they are generally slower than discriminative methods (as [21]). Com-
bining a body model tracking method with a discriminative method allows to
recover from local minima and to benefit from the time coherence of the tracking
which lead to robust and accurate systems [25].
In the specific domain of human pose estimation from a single depth image
(i.e. where no temporal information is used) two recent contributions were pro-
posed by Jung et al. [14,15]. These two methods can be seen as variants of the
method proposed by Girshick et al. [13]. In [14], Jung et al. estimate the joint
locations by “walking” through the depth image and predicting the direction
toward a given joint at each step. The joints are recovered sequentially by tak-
ing advantage of the human skeleton structure. In this way, they significantly
increased the speed compared to [13] (1,000 frames per second (fps) on a single
CPU against 200 fps for a parallelized implementation) while being even more
accurate. In [15], they proposed a two-step method where they first estimate the
3D locations of the joints and then label them. They claim that this procedure
leads to a significant increase of the accuracy compared to the state-of-the-art
methods. However, the dataset used to evaluate these two methods, the EVAL
dataset [10], is only accurate to about 10 cm (as mentioned by the authors and
confirmed by a visual inspection of the data). Moreover, this dataset contains depth images
of subjects performing the same sequence of movements where they are facing
the camera most of the time. Therefore, further experiments with a dataset
containing more precise groundtruths and more diversified poses are needed to
assess the accuracy of these algorithms more reliably.
Recently, a new dataset (UBC3V) was released by Shafaei et al. [19]. This
dataset was built in a similar way as the one used by Shotton et al. [21]. It
can be used to design pose estimation methods using up to 3 depth cameras.
The markerless pose estimation domain was sorely lacking such a publicly
available dataset. Therefore, it is an important contribution that allows new
methods to be compared on the same reliable basis. In [19], Shafaei et al. also proposed a
multiple depth cameras pose estimation method. In each camera, they used a
convolutional network to predict a body part label for each pixel following [20].
Then, they merged the results from all the cameras in a common 3D space and
estimated the joint locations based on the 3D point cloud. Their method
outperforms state-of-the-art pose estimation methods using multiple depth cameras.
3 Our Method
Fig. 1. Orientation convention and ranges of orientations used for the four-model
method in the experiments. Each model is learned using poses with orientations
included in a sector of 110◦ and consecutive sectors have an overlap of 20◦ .
To take into account the orientation estimate in the pose estimation process,
we follow what was done in our previous work [2]. Instead of learning a single
model M to deal with the whole range of orientation O = [0◦ , 360◦ ], we learn n
models Mi designed for different ordered ranges of orientation Oi , such that:
It follows from Eq. (4) that a correct pose estimation model will always be
selected if the error on the orientation estimate is bounded by l/2 degrees (half
the overlap l between consecutive sectors).
Separating the orientation estimation from the pose estimation has two main
advantages: (1) it allows the use of different and more appropriate features for the
orientation estimation, and (2) it greatly facilitates the task of the pose estimation
model, which allows a higher accuracy to be reached with less training data.
This general methodology can be used with various orientation and pose
estimation methods. For the purpose of the experiments, we selected state-of-
the-art methods for these two tasks. These methods are explained below.
Our regression forests are built within a general framework for decision forest
inference problems [8]. This framework offers great flexibility and can
be adapted to a wide range of inference problems. This is achieved thanks to
interfaces that have to be implemented by the user and that define how the
framework interacts with the data.
A forest is an ensemble of decision trees, each composed of split and leaf
nodes. In our implementation, we trained one forest for each body joint j
separately. The goal of the regression forest, for a given joint j and a given
pixel p, is to estimate the offset p→j going from the pixel 3D location
x(p) = (x(p), y(p), z(p)) to the 3D location of the body joint j. At test time,
this is performed for all the silhouette’s pixels and the predicted offsets are added
to the pixel 3D locations to produce a set of estimates for the 3D location of the
body joint j. The final proposal is given by the first mode (obtained with the
mean shift algorithm) of the distribution of these estimates.
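The vote aggregation just described can be sketched in a few lines of numpy; the Gaussian kernel, the 5 cm bandwidth, and the centroid initialisation are our own assumptions (the text only specifies that the first mode found by mean shift is used):

```python
import numpy as np

def joint_proposal(pixel_xyz, offsets, bandwidth=0.05, n_iter=30):
    """Aggregate per-pixel 3D votes for one joint into a single proposal.

    pixel_xyz: (n, 3) pixel 3D locations; offsets: (n, 3) predicted offsets.
    """
    votes = pixel_xyz + offsets          # one 3D estimate per pixel
    mode = votes.mean(axis=0)            # crude initialisation (assumption)
    for _ in range(n_iter):              # fixed-point mean-shift iteration
        sq_dist = np.sum((votes - mode) ** 2, axis=1)
        w = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
        mode = (w[:, None] * votes).sum(axis=0) / w.sum()
    return mode
```

Because the Gaussian kernel down-weights distant votes, a few isolated outlier predictions barely influence the final proposal, which is the point of taking a mode rather than a mean.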
The learning data used to train the tree structure include a set of depth
images, the 3D locations of the body joints in these images, and a set of pix-
els S = {pi } sampled from these images. The features used to describe the
surroundings of a pixel are simple depth comparisons [20]. For a given pixel
p = (x, y) and two 2D offsets Δ = (δ1, δ2), the feature f is defined as:

f(p, Δ) = z(p + δ1/z(p)) − z(p + δ2/z(p)),   (6)
where z(p) is the depth at pixel p. The offsets are divided by the depth at the
pixel p to achieve depth translation invariance.
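The feature of Eq. (6) is simple enough to transcribe directly; treating probes that fall outside the image as a large constant background depth is a common convention and an assumption on our part:

```python
import numpy as np

BACKGROUND_DEPTH = 10.0  # metres; assumed value for out-of-image probes

def depth_feature(depth, p, delta1, delta2):
    """Eq. (6): difference of depths at two probe positions around pixel p.

    depth: (h, w) depth map in metres; p: (x, y) pixel coordinates;
    delta1, delta2: 2D offsets, divided by z(p) for depth translation invariance.
    """
    h, w = depth.shape
    z_p = depth[p[1], p[0]]

    def probe(delta):
        qx = int(p[0] + delta[0] / z_p)
        qy = int(p[1] + delta[1] / z_p)
        if 0 <= qx < w and 0 <= qy < h:
            return depth[qy, qx]
        return BACKGROUND_DEPTH

    return probe(delta1) - probe(delta2)
```

Note how the probe displacement shrinks as z(p) grows, so the same offset pair inspects the same physical neighbourhood regardless of the subject's distance.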
At each split node, multiple 2D offsets Δ and thresholds τ are randomly
sampled from uniform distributions. Each pair φ = (Δj , τj ) induces a partition
of the set of pixels S at the parent node into left and right subsets SL(φ) and
SR(φ) such that:

SL(φ) = {pi | f(pi, Δj) < τj},   (7)
SR(φ) = S \ SL(φ).   (8)
The best φ is selected according to:
φ∗ = argminφ (|SL(φ)|/|S|) I(SL(φ)) + (|SR(φ)|/|S|) I(SR(φ)),   (9)
where |.| denotes the cardinality, and the objective function I(S) is computed as
the variance of the offsets going from each pixel in the set S to the body joint j.
In each leaf, based on the pixels that reached this leaf during training, we
compute and store the mean offset.
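The split-selection rule of Eqs. (7)–(9) amounts to minimising a size-weighted sum of offset variances. A minimal sketch, with the feature responses assumed precomputed for each candidate pair φ = (Δj, τj):

```python
import numpy as np

def offset_variance(offsets):
    """Objective I(S): sum over the three axes of the variance of the offsets."""
    if len(offsets) == 0:
        return 0.0
    return float(np.var(offsets, axis=0).sum())

def best_split(responses, offsets, thresholds):
    """Eq. (9): pick the candidate minimising the weighted variance objective.

    responses: (n_pixels, n_candidates) feature values f(pi, Δj);
    offsets: (n_pixels, 3) offsets from each pixel to the joint;
    thresholds: one τj per candidate.
    """
    n = responses.shape[0]
    best = None
    for j, tau in enumerate(thresholds):
        left = responses[:, j] < tau                                  # Eq. (7)
        score = (left.sum() / n) * offset_variance(offsets[left]) \
              + ((~left).sum() / n) * offset_variance(offsets[~left])  # Eq. (9)
        if best is None or score < best[0]:
            best = (score, j, tau)
    return best  # (objective value, candidate index, threshold)
```

A split that cleanly separates pixels with similar offsets drives both per-child variances, and hence the objective, toward zero.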
4 Experiments
4.1 Dataset
To learn and test our proposed method, we used the dataset UBC3V introduced
by Shafaei and Little [19]. This dataset is composed of synthetic depth images
of realistic human characters with various poses for training and evaluation of
single or multiview depth-based pose estimation methods. The data generation
procedure was very close to the one used by Shotton et al. [21]. The human
characters were generated using the free and open source software MakeHuman and
the poses were taken from the CMU motion capture database [5]. Such a public
dataset, with realistic and varied poses and precise groundtruths, was sorely
missing in the markerless pose estimation community. We hope it will enable a
reliable evaluation of existing and future methods and a comparison of their
performances on the same basis.
Before that, the EVAL dataset [10] was frequently used to evaluate pose esti-
mation methods [14,15,27]. However, in the EVAL dataset, the human models
are facing the camera most of the time and performing the same sequence of
movements. Moreover, according to the authors of the dataset [10,11], and as
confirmed by visual inspection, the precision of the groundtruth 3D joint locations
is about 10 cm. This is a problem for reliably evaluating the accuracy of pose
estimation methods. The new synthetic dataset UBC3V fixes these problems by
covering the whole orientation range and by providing precise groundtruths.
The UBC3V dataset consists of three sub-datasets with varying complexity.
The first one (easy-pose) contains one human character in various standing poses.
The second one (inter-pose) still contains only one human character but with a
more varied set of poses, including, for instance, upside-down and lying poses.
Finally, the third and last dataset (hard-pose) includes several human characters
with different physical shapes. For each instance of a human character and a
pose, three cameras are randomly placed in a virtual scene and depth images are
rendered from these viewpoints. Moreover, each of the three mentioned datasets
includes train, validation, and test sets with mutually disjoint poses.
In this paper, we only used the first dataset (easy-pose) because it is the only
one that is compatible with a meaningful definition of the orientation. Indeed,
for all the poses where the human character is not in a standing position, it
is not possible to reliably define an orientation. Nonetheless, this is not a big
limitation for our method because, in most applications, the human subject is
in a standing position.
cos(Δθ) = (cos θ, sin θ) · (cos θ̂, sin θ̂) = cos(θ̂ − θ),   (10)

and therefore

Δθ = cos⁻¹(cos(θ̂ − θ)).   (11)
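Eqs. (10) and (11) give an angular error that is insensitive to the 0◦/360◦ wrap-around. In code:

```python
import numpy as np

def orientation_error_deg(theta_deg, theta_hat_deg):
    """Absolute angular error (degrees) between groundtruth and estimate,
    via the dot product of unit vectors, as in Eqs. (10)-(11)."""
    t = np.deg2rad(theta_deg)
    th = np.deg2rad(theta_hat_deg)
    dot = np.cos(t) * np.cos(th) + np.sin(t) * np.sin(th)   # Eq. (10)
    return np.rad2deg(np.arccos(np.clip(dot, -1.0, 1.0)))   # Eq. (11)

# orientation_error_deg(350, 10) is 20, not 340: the wrap-around is handled.
```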
We used 100k images from the train set to learn the ExtRaTrees model with
100 trees and default parameters [12]. The learned model was then evaluated
on 10k images from the test set. For each image, the orientation estimate was
obtained using Eq. (5). Then, Eqs. (10) and (11) were used to determine the error
Δθ on each image. The mean error Δθ on the 10k test images was 6.2◦ , which is
consistent with the result reported in [17]. It is worth mentioning that, with the
learned models, an error of more than 20◦ is made on less than 5% of the poses in
the test set. This may lead, in rare cases, to the choice of a wrong pose estimation
model if we work on single depth images with no use of temporal information.
However, in practice, as mentioned in [17], a light temporal filtering can be used
to avoid these rare bigger errors. Such a filtering has not been implemented in
this work as we aim at estimating the pose from single images.
To have a basis of comparison for our approach, we learned and evaluated the
pose estimation method without using an estimation of the orientation of the
person seen by the camera. In this case, for each considered body joint, a single
model has to deal with the whole range of orientations [0◦ , 360◦ ]. The machine
learning model has to implicitly learn the small clues related to the orientation
in the depth image while predicting the pose at the same time. This is quite a hard
task when the observed person can take any orientation. As we will see in the next
section, knowledge of the orientation can substantially ease this task.
To build the learning set, given the available computational resources, 20k
images were picked from the train set of the easy-pose dataset and 1k pixels were
randomly sampled from each of them. Then, a forest of three trees (as in [21])
was learned for each body joint separately. Finally, we evaluated the learned
models on 10k test images.
We analyzed seven body joints: head, shoulder, elbow, wrist, hip, knee, and ankle.
For each considered body joint, the accuracy of the corresponding model is given
by the mean Euclidean distance error of the predictions with respect to the
groundtruth 3D body joint locations. The results are given in the first column
of Table 1.
Table 1. Mean Euclidean distance errors on the positions of the considered body joints
for different numbers of models and learning dataset sizes. We can see that even with
ten times less training data (i.e. 500 images for each of the four models instead of
20k for a single model), the four-model method outperforms the one-model method for
almost all the body joints.
Amount of models:           1         4                        4
Learning images per model:  20,000    20,000/4 = 5,000         500
Range of each model:        360◦      360◦/4 + 2 × 10◦ = 110◦  360◦/4 + 2 × 10◦ = 110◦

Mean error  Head      2.6 cm    2.3 cm (−11.5%)    2.6 cm (−0%)
            Shoulder  3.7 cm    2.9 cm (−21.6%)    3.2 cm (−13.5%)
            Elbow     8.9 cm    6.3 cm (−29.2%)    7.8 cm (−12.4%)
            Wrist     13.9 cm   10.4 cm (−25.2%)   13.2 cm (−5.0%)
            Hip       3.4 cm    2.5 cm (−26.5%)    3.1 cm (−8.8%)
            Knee      6.3 cm    4.6 cm (−27.0%)    6.4 cm (+1.6%)
            Ankle     11.2 cm   8.1 cm (−27.7%)    10.8 cm (−3.6%)
            Mean      7.1 cm    5.3 cm (−25.4%)    6.7 cm (−5.6%)
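As a sanity check, the relative improvements quoted in Table 1 can be reproduced from the absolute errors; the percentages are relative to the one-model baseline:

```python
def improvement(baseline_cm, new_cm):
    """Relative error reduction with respect to the one-model baseline, in %."""
    return round(100.0 * (baseline_cm - new_cm) / baseline_cm, 1)

assert improvement(7.1, 5.3) == 25.4   # mean row, four models with 5k images each
assert improvement(2.6, 2.3) == 11.5   # head row
```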
There is a trade-off between the accuracy improvement obtained from the shrinkage
of the orientation ranges when we increase the number of models and the accuracy
deterioration due to the reduction of the learning set size per model.
With the selected orientation ranges, when the orientation estimate doesn’t
fall in the overlap of two sectors, a correct pose estimation model is always
selected (using Eq. (4)) if the error on the orientation estimate is smaller than
20◦ . However, when the orientation estimate falls in the overlap of two sectors,
the risk of selecting a wrong pose estimation model is higher. Here, we can ensure
that a correct pose estimation model will be selected for any orientation estimate
if the error on the orientation estimate is bounded by 10◦ which is equal to half
of the overlap.
To make a fair comparison with the one-model method, we used the same
total amount of data, i.e. a total of 20k images with 5k images for each of the
four models. The results on 10k test images are given in the second column of
Table 1. We can see a significant accuracy improvement for all the body joints
with respect to the one-model method.
Figure 2 illustrates the influence of the learning dataset size on the mean
Euclidean error for the most challenging joint (the wrist). The three curves
correspond, respectively from top to bottom, to the one-model method, the four-model
method when we know the exact orientation, and the four-model method when
we use an orientation estimate. As noticed above, the accuracy improvement for
a constant learning dataset size is significant when we use an orientation esti-
mate. We also see that the accuracy obtained with the orientation estimate is a
little bit worse than the one obtained knowing the exact orientation. This means
that a wrong pose estimation model is selected for some poses in the test set.
This difference can be reduced by using bigger overlaps between consecutive
sectors, at the expense of a loss of accuracy.
Fig. 2. Mean Euclidean error on the wrist for the one-model method (blue curve), the
four-model method using the exact orientation (red curve) and the four-model method
using the predicted orientation (green curve). (Color figure online)
5 Conclusion
In this paper, we evaluated a novel two-step markerless pose estimation method
on the public pose estimation dataset UBC3V. The first step of this method
consists in estimating the orientation of the person seen by the camera. Then,
this orientation estimate is used to dynamically choose an appropriate pose
estimation model. Our results show that this methodology significantly improves
the accuracy when a constant learning set size is used. Moreover, we show that
using an orientation estimate makes it possible to reach a better accuracy even
when the learning set size is reduced by a factor greater than 10.
References
1. Anguelov, D., Srinivasan, P., Koller, D., Thrun, S., Rodgers, J., Davis, J.: SCAPE: shape completion and animation of people. ACM Trans. Graph. 24(3), 408–416 (2005)
2. Azrour, S., Piérard, S., Van Droogenbroeck, M.: Leveraging orientation knowledge to enhance human pose estimation methods. In: Perales, F., Kittler, J. (eds.) AMDO 2016. LNCS, vol. 9756, pp. 81–87. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-41778-3_8
3. Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell. 24(4), 509–522 (2002)
4. Bourdev, L., Malik, J.: Poselets: body part detectors trained using 3D human pose annotations. In: IEEE International Conference on Computer Vision (ICCV), Kyoto, September/October 2009, pp. 1365–1372 (2009)
5. Carnegie Mellon University: Motion capture database. http://mocap.cs.cmu.edu
6. Corazza, S., Gambaretto, E., Mündermann, L., Andriacchi, T.: Automatic generation of a subject-specific model for accurate markerless motion capture and biomechanical applications. IEEE Trans. Biomed. Eng. 57(4), 806–812 (2010)
7. Corazza, S., Mündermann, L., Gambaretto, E., Ferrigno, G., Andriacchi, T.: Markerless motion capture through visual hull, articulated ICP and subject specific model generation. Int. J. Comput. Vis. 87(1–2), 156–169 (2010)
8. Criminisi, A., Shotton, J. (eds.): Decision Forests for Computer Vision and Medical Image Analysis. Springer, Heidelberg (2013). https://doi.org/10.1007/978-1-4471-4929-3
9. Gall, J., Stoll, C., De Aguiar, E., Theobalt, C., Rosenhahn, B., Seidel, H.-P.: Motion capture using joint skeleton tracking and surface estimation. In: IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1746–1753 (2009)
10. Ganapathi, V., Plagemann, C., Koller, D., Thrun, S.: Real-time human pose tracking from range data. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7577, pp. 738–751. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33783-3_53
11. Ganapathi, V., Plagemann, C., Thrun, S., Koller, D.: Real time motion capture
using a single time-of-flight camera. In: IEEE International Conference Computer
Vision and Pattern Recognition (CVPR), San Francisco, CA, USA, pp. 755–762,
June 2010
12. Geurts, P., Ernst, D., Wehenkel, L.: Extremely randomized trees. Mach. Learn.
63(1), 3–42 (2006)
13. Girshick, R., Shotton, J., Kohli, P., Criminisi, A., Fitzgibbon, A.: Efficient regres-
sion of general-activity human poses from depth images. In: IEEE International
Conference Computer Vision (ICCV), Barcelona, Spain, pp. 415–422, November
2011
14. Jung, H., Lee, S., Heo, Y., Yun, I.: Random tree walk toward instantaneous 3D
human pose estimation. In: IEEE International Conference Computer Vision and
Pattern Recognition (CVPR), pp. 2467–2474, June 2015
15. Jung, H.Y., Suh, Y., Moon, G., Lee, K.M.: A sequential approach to 3D human
pose estimation: separation of localization and identification of body joints. In:
Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9909, pp.
747–761. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46454-1 45