
# Who is who at different cameras: people re-identification using depth cameras

## Belén Castillón Fernández de Pinedo

Outline

1. Introduction
2. System description
   2.1 Kinect sensor
   2.2 Camera calibration
   2.3 Height maps
   2.4 Segmentation
   2.5 Tracking
3. Bodyprints
   3.1 Extraction
   3.2 Matching people
4. Results and discussion
5. Conclusions

1. Introduction

## Goal

Obtain a feature vector per person, which we call a bodyprint; these vectors can then be matched to solve the re-identification problem. Bodyprints are obtained using calibrated depth-colour cameras such as the Microsoft Kinect.

## Main problem

In multi-camera systems it is difficult to re-identify people who leave one camera's view and enter another, or re-enter the same one, after a period of time.

2. System description

## 2.1 Kinect sensor

- Provides aligned RGB and depth images.
- Has a colour (RGB) camera and an infrared (IR) camera. It also has an IR pattern generator that, jointly with the IR camera, makes it possible to determine depth.
- Segments people and is able to estimate their position.
- The sensor can measure depth from a minimum distance of around 1 m up to a maximum of about 10 m (indoor use only).

## 2.2 Camera calibration

Calibration is used to change from camera coordinates to world coordinates. For every pixel $(u, v)$ with measured depth $Z$, the spatial camera coordinates are given by the pinhole model:

$$X = \frac{(u - c_x)\,Z}{f_x}, \qquad Y = \frac{(v - c_y)\,Z}{f_y},$$

where $(c_x, c_y)$ is the principal point and $(f_x, f_y)$ are the focal lengths in pixels. These equations provide the 3D coordinates of each pixel.

This coordinate system has its origin at the optical centre of the camera and is aligned with the camera axes.

To determine the ground plane, we simply select a portion of the RGB-D image that corresponds to the ground. The $Z_{world}$ axis is then taken normal to this plane.

## 2.3 Height maps

Height maps are images where pixel values represent the height with respect to the ground.
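Once every pixel has world coordinates, the height map reduces to a signed point-to-plane distance. A minimal sketch (function name and array shapes are assumptions):

```python
import numpy as np

def height_map(points_world, ground_point, normal):
    """Per-pixel height above the ground plane.

    points_world: (H, W, 3) array of world coordinates per pixel
    ground_point: any 3D point lying on the ground plane
    normal:       normal of the ground plane (the Z_world axis)
    """
    n = np.asarray(normal, dtype=float)
    n /= np.linalg.norm(n)
    # Signed distance of every pixel's 3D point to the plane.
    return (points_world - ground_point) @ n

pts = np.zeros((2, 2, 3))
pts[0, 0] = [0.0, 0.0, 1.7]   # e.g. a pixel on someone's head, 1.7 m up
h = height_map(pts, ground_point=np.zeros(3), normal=[0, 0, 1])
```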

## 2.4 Segmentation

Segmentation tells us which pixels in the original image belong to each particular person.

## 2.5 Tracking

Tracking is the process of linking the segmentation results from several frames that correspond to the same person.
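The linking step can be sketched as greedy nearest-centroid matching on ground positions. This is a hedged illustration of the idea; the tracker actually used in the paper is not reproduced here, and the distance threshold is an assumption:

```python
import numpy as np

def link_tracks(prev_tracks, detections, max_dist=0.5):
    """Greedily link current-frame detections to existing tracks.

    prev_tracks: dict track_id -> last known (x, y) ground position
    detections:  list of (x, y) ground positions in the current frame
    Returns dict track_id -> detection index; unmatched detections get new ids.
    """
    assignment = {}
    used = set()
    next_id = max(prev_tracks, default=-1) + 1
    for i, d in enumerate(detections):
        best, best_dist = None, max_dist
        for tid, p in prev_tracks.items():
            dist = float(np.hypot(d[0] - p[0], d[1] - p[1]))
            if dist < best_dist and tid not in used:
                best, best_dist = tid, dist
        if best is None:          # too far from every track: start a new one
            best = next_id
            next_id += 1
        used.add(best)
        assignment[best] = i
    return assignment

links = link_tracks({0: (0.0, 0.0)}, [(0.1, 0.0), (5.0, 5.0)])
```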

3. Bodyprints

## Key idea

To match people, we extract a feature vector per track, which we call a bodyprint. Each bodyprint summarises the colour appearance of a track at each height.

At each time $t$, the mean RGB value at each given height is computed to obtain the temporal signatures.
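A sketch of computing one temporal signature from a single frame. The bin size, height range, and function signature are assumptions for illustration:

```python
import numpy as np

def temporal_signature(rgb, hmap, person_mask, n_bins=200, max_h=2.0):
    """Mean RGB value per height bin for one person at one frame.

    rgb:         (H, W, 3) colour image
    hmap:        (H, W) height map in metres
    person_mask: (H, W) boolean mask of the person's pixels
    Returns (n_bins, 3) mean colours and (n_bins,) pixel counts per bin.
    """
    sig = np.zeros((n_bins, 3))
    cnt = np.zeros(n_bins)
    # Quantise each pixel's height into a bin index.
    bins = np.clip((hmap / max_h * n_bins).astype(int), 0, n_bins - 1)
    for b, col in zip(bins[person_mask], rgb[person_mask]):
        sig[b] += col
        cnt[b] += 1
    nz = cnt > 0
    sig[nz] /= cnt[nz, None]      # sums -> means where we saw pixels
    return sig, cnt

sig, cnt = temporal_signature(
    rgb=np.full((2, 2, 3), 100.0),
    hmap=np.full((2, 2), 1.0),
    person_mask=np.ones((2, 2), dtype=bool),
)
```

The count vector is kept alongside the colours: bins with few pixels are less reliable, which matters later when bodyprints are compared.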


## 3.1 Extraction

We obtain bodyprints by averaging the temporal signatures along time:

$$\mathrm{RGB}_k(h) = \frac{\sum_t c_{k,t}(h)\,\mathrm{RGB}_{k,t}(h)}{\sum_t c_{k,t}(h)}, \qquad C_k(h) = \sum_t c_{k,t}(h),$$

where the bodyprint vector $\mathrm{RGB}_k(h)$ describes the appearance of person $k$ at height $h$, and the count vector $C_k(h)$ measures the reliability of the bodyprint values.
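Assuming the time average is weighted by the per-frame pixel counts, the extraction step can be sketched as:

```python
import numpy as np

def bodyprint(signatures, counts):
    """Count-weighted average of per-frame temporal signatures.

    signatures: (T, n_bins, 3) mean RGB per height bin per frame
    counts:     (T, n_bins)    pixel counts per height bin per frame
    Returns RGB_k(h) of shape (n_bins, 3) and C_k(h) of shape (n_bins,).
    """
    C = counts.sum(axis=0)
    weighted = (signatures * counts[..., None]).sum(axis=0)
    RGB = np.zeros_like(weighted)
    nz = C > 0
    RGB[nz] = weighted[nz] / C[nz, None]   # avoid dividing empty bins
    return RGB, C

# Two frames, one height bin: 1 pixel of colour 10, then 3 pixels of colour 30.
sigs = np.zeros((2, 1, 3))
sigs[0, 0] = [10.0, 10.0, 10.0]
sigs[1, 0] = [30.0, 30.0, 30.0]
RGB, C = bodyprint(sigs, np.array([[1.0], [3.0]]))
```

Weighting by counts means a frame where a height was well observed contributes more than one where it was barely visible.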

## 3.2 Matching people

To compare bodyprints we propose a normalized weighted correlation coefficient.

To compare bodyprints $j$ and $k$ we use a weight $W(h)$ that allows comparing bodyprints with missing values (due, for example, to occlusions).

We compute a weighted mean for each track, which is used to compensate changes in brightness, and finally compute the correlation.
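A sketch of this weighted correlation. The exact definition of $W(h)$ is not reproduced here, so using the elementwise minimum of the two count vectors is an assumption; the weighted means play the brightness-compensation role described above:

```python
import numpy as np

def weighted_correlation(bp_j, bp_k, C_j, C_k):
    """Normalized weighted correlation between two bodyprints.

    bp_j, bp_k: (n_bins, 3) bodyprint colour vectors
    C_j, C_k:   (n_bins,) count (reliability) vectors
    Heights unseen in either track get zero weight, handling missing values.
    """
    W = np.minimum(C_j, C_k).astype(float)   # assumed form of W(h)
    if W.sum() == 0:
        return 0.0
    rho = 0.0
    for c in range(3):                        # correlate each colour channel
        x, y = bp_j[:, c], bp_k[:, c]
        # Weighted means compensate global brightness changes.
        mx = (W * x).sum() / W.sum()
        my = (W * y).sum() / W.sum()
        cov = (W * (x - mx) * (y - my)).sum()
        vx = (W * (x - mx) ** 2).sum()
        vy = (W * (y - my) ** 2).sum()
        rho += cov / np.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0
    return rho / 3.0

bp = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0]])
rho = weighted_correlation(bp, bp, np.ones(3), np.ones(3))
```

A track compared with itself yields a coefficient of 1; candidates are then ranked by this score to pick the best match.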

## 4. Results and discussion

Experiment 1: people recorded by camera 1 are searched for in videos recorded by camera 2. One camera captures people entering a shop and another one captures people at the exit (front1-front2 and rear1-rear2 views). The average correct re-identification rate obtained is 93%.

(Figures: front-front and rear-rear matching examples.)

Example of a wrong match: the correct match had the second-highest correlation coefficient, and it was very similar to the highest (0.87345 vs 0.87212).

Experiment 2: people are matched within the same camera. The key difference compared with the previous experiment is that frontal and rear views are now compared. The average correct re-identification rate obtained in this experiment drops to 55%.

Problems in re-identification arise from the presence of logos on T-shirts and of backpacks.

5. Conclusions

The method has proved to be robust against differences in illumination, point of view, and momentary partial occlusions.

Errors:

- Similar appearance of two different people.
- Different appearance of the same person from the point of view of each camera.

Solutions:

- More complex models can be used.
- Models that take into account the relative angular position with respect to the person's axis.