
this way it is possible to represent the color distribution with a small number of component Gaussians; however, building and updating MoGs via EM is time-consuming.

The idea of the second category is histogram matching. It usually needs a reference color model of the object and a similarity measure to evaluate the similarity between the reference and the candidate color models. The candidate whose mode is most similar to the reference is selected as the tracking result in the current frame. The well-known mean shift algorithm [5] and the improved work that follows it [23,1,4,22] fall into this category; there, the color model is represented by a weighted histogram (a kernel-based probability distribution) and the similarity is measured with the Bhattacharyya distance. By first-order gradient ascent of the similarity measure, the mean shift algorithm is derived, with which the locally best candidate is reached.

The method proposed in this paper belongs to the second class and aims at solving two problems in these algorithms. Conventional histogram methods [5,23,1,4,22] partition the whole color space of the object into a regular square tessellation, neglecting the fact that object color is usually very compact and distributed only in some small regions of the whole color space, which leads to a large number of empty bins and a waste of computational resources. The second problem is that the ample color information within each bin is not modelled; the distribution of the multi-channel gray levels is discarded.

To address these two problems, a clustering-based color model is proposed and a fast algorithm based on Integral Images is developed for object tracking. In Section 2, K-means clustering is used to partition the color space adaptively, and the histogram bins of the object model are determined accordingly. Moreover, we model the multi-channel gray-level distribution in each bin with a Gaussian to capture a richer description of the target. A similarity measure based on the Bhattacharyya distance, and its simplified form, is then introduced to evaluate the similarity between two color models. In Section 3, Integral Images for computing the histogram, mean and variance are proposed, with which the color model can be evaluated with fast array-index operations. Thanks to the Integral Images, it is possible to implement a brute-force search tracking algorithm efficiently. In Section 4, diverse experiments demonstrate the validity and performance of the algorithm.

2. Clustering-based color model

It is a common understanding that adaptive-binning histograms can represent distributions more efficiently and more accurately with far fewer bins. Although adaptive partition of the color space has long been studied in image coding [6] and image segmentation [2], little related work has been found in object tracking.

2.1. Adaptive partition of color space

In this paper, K-means clustering [7] is employed to adaptively partition the color space of the object. According to the clustering result, the histogram bins are determined using the following simple method. For each cluster, the pixel farthest from the cluster center determines the bin range, a non-uniform rectangle in two dimensions or hyper-rectangle in higher dimensions. Adjacent rectangles (or hyper-rectangles) may have small overlapping regions. For a pixel within such an overlapping region, its identity is determined by computing its distance to the relevant cluster centers and selecting the cluster with the minimum distance.
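As an illustration of the partition scheme just described (a minimal sketch, not the authors' implementation; `adaptive_bins` and `bin_of` are hypothetical helper names), plain Lloyd's K-means plus a farthest-member bounding box per cluster gives the non-uniform bins, with overlapping regions resolved by nearest-center distance:

```python
import numpy as np

def adaptive_bins(pixels, d, iters=20, seed=0):
    """Partition object colors into d non-uniform (hyper-)rectangular bins.

    pixels: (N, C) array of object colors (e.g. RG or RGB values).
    Returns cluster centers and per-cluster bin ranges (lo, hi).
    """
    pixels = np.asarray(pixels, dtype=float)
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), d, replace=False)]
    for _ in range(iters):  # plain Lloyd's K-means
        dist = np.linalg.norm(pixels[:, None, :] - centers[None], axis=2)
        label = dist.argmin(axis=1)
        for u in range(d):
            if np.any(label == u):
                centers[u] = pixels[label == u].mean(axis=0)
    # Final assignment, then one bin range per cluster, spanned by the
    # pixel farthest from the cluster center.
    dist = np.linalg.norm(pixels[:, None, :] - centers[None], axis=2)
    label = dist.argmin(axis=1)
    boxes = []
    for u in range(d):
        members = pixels[label == u]
        if len(members) == 0:               # empty cluster: degenerate box
            boxes.append((centers[u].copy(), centers[u].copy()))
            continue
        r = np.abs(members - centers[u]).max(axis=0)  # farthest extent per axis
        boxes.append((centers[u] - r, centers[u] + r))
    return centers, boxes

def bin_of(pixel, centers, boxes):
    """Bin id of a pixel; ties in overlapping rectangles go to the nearest center."""
    inside = [u for u, (lo, hi) in enumerate(boxes)
              if np.all(pixel >= lo) and np.all(pixel <= hi)]
    if not inside:           # outside all boxes: fall back to the nearest center
        inside = range(len(centers))
    return min(inside, key=lambda u: np.linalg.norm(pixel - centers[u]))
```

The tie-break in `bin_of` implements the overlapping-region rule from the paragraph above.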

Fig. 1 presents an example of adaptive partition of the color space. The left figure is a reference image of a human face. The middle figure shows the color distribution of the object in RG color space, from which we can see the color is very compact and distributed only in some small regions of the whole RG color space. The right figure shows the non-uniform histogram bins obtained from K-means clustering (d = 6), where pixels belonging to the same bin are labelled with the same color.

Determination of the number of histogram bins is an important yet unresolved problem in color-based object tracking [5,23,1,3,21,22]. Too many bins fail to handle environment changes or noise, which leads to tracking failures; too few fail to allow good discrimination of the target color model, resulting in distraction by similar color regions nearby. In our case, straightforward application of clustering algorithms [8] that automatically select the cluster number cannot yet solve this problem. Thus, like most color-based tracking algorithms, the bin number is set empirically (between d = 4 and 8 in our case), and selection of the bin number to account for environment changes is left for future work.

ARTICLE IN PRESS

L. Peihua / Signal Processing: Image Communication 21 (2006) 676–687, p. 677

2.2. Color model and similarity measure

Based on the adaptive bins obtained above, given a reference image consisting of a set of pixels I(x_i), i = 1, ..., N, the reference color model is represented by p = {p_u}, u = 1, ..., d, where p_u is defined as

p_u(I(x); \beta_u, \mu_u, \Sigma_u) = \beta_u \, G(\mu_u, \Sigma_u).   (1)

In the above equation, G(\mu_u, \Sigma_u) is a Gaussian distribution with mean vector \mu_u and covariance matrix \Sigma_u, and \beta_u, \mu_u, \Sigma_u are of the following forms:

\beta_u = n_u / N,
\mu_u = \frac{1}{n_u} \sum_{i=1}^{N} I(x_i) \, \delta_u(x_i),
\Sigma_u = \frac{1}{n_u} \sum_{i=1}^{N} (I(x_i) - \mu_u)(I(x_i) - \mu_u)^T \, \delta_u(x_i),   (2)

where n_u = \sum_{i=1}^{N} \delta_u(x_i) is the number of pixels within the u-th bin, and \delta_u(x_i) is the Kronecker function, which is 1 if I(x_i) falls into the u-th bin and 0 otherwise.
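A sketch of Eqs. (1)–(2) in code (assuming each pixel's bin index has already been computed, e.g. by the partition of Section 2.1; `color_model` is a hypothetical name, not the authors' code):

```python
import numpy as np

def color_model(I, bin_ids, d):
    """Reference color model per Eqs. (1)-(2).

    I: (N, 3) pixel colors; bin_ids: (N,) bin index of each pixel.
    Returns a list of (beta_u, mu_u, Sigma_u) for u = 0..d-1.
    """
    N = len(I)
    model = []
    for u in range(d):
        members = I[bin_ids == u]
        n_u = len(members)
        beta = n_u / N                      # bin weight beta_u = n_u / N
        if n_u == 0:
            model.append((0.0, np.zeros(3), np.zeros((3, 3))))
            continue
        mu = members.mean(axis=0)           # per-bin mean vector, Eq. (2)
        diff = members - mu
        Sigma = diff.T @ diff / n_u         # per-bin covariance, Eq. (2)
        model.append((beta, mu, Sigma))
    return model
```

Note that the weights beta_u sum to one over the bins, so the model is a compact mixture description of the object colors.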

Consider the color model q = {q_u}, u = 1, ..., d, of a candidate region comprising N' pixels, in which the component distribution has the form

q_u(I(x); \beta'_u, \mu'_u, \Sigma'_u) = \beta'_u \, G(\mu'_u, \Sigma'_u),   (3)

where \beta'_u, \mu'_u and \Sigma'_u have forms similar to those in Eq. (2). The similarity between two component distributions p_u(I(x); \beta_u, \mu_u, \Sigma_u) and q_u(I(x); \beta'_u, \mu'_u, \Sigma'_u) is measured using the Bhattacharyya distance \rho(p_u, q_u) = \int p_u^{1/2} q_u^{1/2} \, dI(x). Carrying out the integral gives

\rho(p_u, q_u) = c_u \exp\!\left( -\frac{1}{4} (\mu_u - \mu'_u)^T (\Sigma_u + \Sigma'_u)^{-1} (\mu_u - \mu'_u) \right),   (4)

where c_u is given by

c_u = (2 \beta_u \beta'_u)^{1/2} \, \frac{|\Sigma_u|^{1/4} \, |\Sigma'_u|^{1/4}}{|\Sigma_u + \Sigma'_u|^{1/2}}.   (5)

Thus, the similarity measure between the two distributions p = {p_u} and q = {q_u} is defined as

\rho(p, q) = \sum_{u=1}^{d} \rho(p_u, q_u).   (6)

2.2.1. Simplification of the color model

Assuming that the gray-level distributions of the different channels in each bin are independent of each other, the covariance matrix becomes diagonal and the similarity measure can be simplified. Let \mu_u = [m_{u,1} \; m_{u,2} \; m_{u,3}]^T and \Sigma_u = \mathrm{diag}\{\sigma^2_{u,1}, \sigma^2_{u,2}, \sigma^2_{u,3}\}; the similarity measure between two component distributions, Eq. (4), simplifies to

\rho(p_u, q_u) = c_u \exp\!\left( -\frac{1}{4} \sum_{j=1}^{3} \frac{(m_{u,j} - m'_{u,j})^2}{\sigma^2_{u,j} + \sigma'^2_{u,j}} \right),   (7)

where c_u has the following form:

c_u = (2 \beta_u \beta'_u)^{1/2} \prod_{j=1}^{3} \left( \frac{\sigma_{u,j} \, \sigma'_{u,j}}{\sigma^2_{u,j} + \sigma'^2_{u,j}} \right)^{1/2}.   (8)

The advantage of this assumption is that the means and the variances can be evaluated in array-index operations through the Integral Images described in Section 3.1.

Fig. 1. Adaptive partition of color space. From left to right: a reference image (size: 73 × 69) of a human face, the histogram of the reference model, and the non-uniform histogram bins in RG space.
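Under the diagonal-covariance assumption, Eqs. (6)–(8) reduce to a few array operations per bin. A sketch (hypothetical function names, not the authors' code; each bin component is a weight beta, a 3-vector of channel means and a 3-vector of channel variances, assumed strictly positive):

```python
import numpy as np

def bin_similarity(beta, m, var, beta2, m2, var2):
    """Similarity of two bin components, Eqs. (7)-(8).

    beta: bin weight; m, var: per-channel means and variances (length 3).
    Variances are assumed strictly positive.
    """
    s = var + var2                              # sigma^2 + sigma'^2, per channel
    # c_u = sqrt(2 beta beta') * prod_j sqrt(sigma sigma' / (sigma^2 + sigma'^2))
    c = np.sqrt(2.0 * beta * beta2) * np.prod(np.sqrt(np.sqrt(var * var2) / s))
    return c * np.exp(-0.25 * np.sum((m - m2) ** 2 / s))

def model_similarity(p, q):
    """Similarity of two color models, Eq. (6): sum over the d bin components."""
    return sum(bin_similarity(*pu, *qu) for pu, qu in zip(p, q))
```

With this definition, two identical bin components score beta_u / 2 (not 1), so the full-model similarity of a model with itself is 1/2; what matters for tracking is only the ranking of candidates.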

2.2.2. Remarks

A histogram models the probability p_u of a pixel I(x) falling into the u-th bin. It is interesting to compare the forms of p_u for the different definitions of histograms:

p_u = \beta_u                             for a traditional histogram;
p_u = \beta_u \, G(\mu_u, \Sigma_u)       for the histogram proposed in this paper.

A traditional histogram only counts the number of pixels belonging to each bin, without modelling the color distribution within the bin; the underlying assumption is that all pixels within a bin are uniformly distributed. In the histogram proposed in this paper, in addition to the pixel count, the distribution within each bin is modelled as a Gaussian.

3. Fast algorithm based on Integral Images for object tracking

Exhaustive search via histogram comparison for the maximal mode is computationally prohibitive in real-time tracking applications. However, with the Integral Images proposed below, a brute-force search becomes feasible.

Motivated by the work of Viola and Jones [19], we presented a straightforward method to compute histograms by introducing the concept of the Integral Histogram Image [20]. Porikli independently presented the concept of the Integral Histogram and analyzed its computational complexity at length [13]. With both methods, the histogram of a rectangular region of any size can be obtained with fast array-index operations.

In this paper we use the method introduced in [20] to compute the histogram. Furthermore, we extend the work of Viola and Jones by presenting Integral Images for computing the means and variances of the three channels in each bin.

3.1. Computation of color distribution through Integral Images

Given the original color image D(x, y) = (D_j(x, y), j = 1, 2, 3), we present Integral Images I^b_u(x, y), I^m_{u,j}(x, y) and I^s_{u,j}(x, y), where u = 1, ..., d and j = 1, 2, 3, for computing the histogram and the per-channel means and variances of the gray levels.

Assume the image D(x, y) is of size M × N pixels. The corresponding Integral Image for the histogram is an array with (M + 1) × (N + 1) rows and d columns. The Integral Image I^b_u(x, y) at location (x, y) is the number of pixels falling within the u-th bin above and to the left of (x, y) in the image:

I^b_u(x, y) = \sum_{x' \le x, \, y' \le y} \delta_u(x', y'),   (9)

where \delta_u(x', y') = 1 if the pixel at location (x', y') belongs to the u-th bin and \delta_u(x', y') = 0 otherwise. Using the following pair of recurrences:

i^b_u(x, y) = i^b_u(x - 1, y) + \delta_u(x, y),
I^b_u(x, y) = I^b_u(x, y - 1) + i^b_u(x, y),   u = 1, ..., d,   (10)

where i^b_u(0, y) = 0 and I^b_u(x, 0) = 0 for any x and y, the Integral Image for the histogram can be computed in one pass over the original image.

Given any rectangle, its histogram n_u (u = 1, ..., d) can be determined in 4d array references (see Fig. 2 and Eq. (11)) from the Integral Histogram Image, for u = 1, ..., d:

n_u = I^b_u(x + w, y + h) - I^b_u(x + w, y) - I^b_u(x, y + h) + I^b_u(x, y),   (11)

where I^b_u(x, 0) = I^b_u(0, y) = 0, and w and h are the width and height of the rectangle, respectively.
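Eqs. (9)–(11) can be sketched directly with cumulative sums; the extra zero row and column play the role of the boundary conditions (a sketch under the stated conventions, not the authors' code):

```python
import numpy as np

def integral_histogram(bin_map, d):
    """Integral Image per bin, Eqs. (9)-(10).

    bin_map: (H, W) array of per-pixel bin indices (row index first).
    Returns an (H+1, W+1, d) array; entry (y, x, u) holds the count of
    bin-u pixels above and to the left of (x, y).
    """
    H, W = bin_map.shape
    ii = np.zeros((H + 1, W + 1, d), dtype=np.int64)
    delta = (bin_map[:, :, None] == np.arange(d)).astype(np.int64)
    # One pass: the two cumulative sums realize the pair of recurrences (10).
    ii[1:, 1:] = delta.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_histogram(ii, x, y, w, h):
    """Histogram n_u of the w-by-h rectangle at (x, y) in 4d references, Eq. (11)."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]
```

Once `ii` is built, `rect_histogram` costs the same for any rectangle size, which is what makes the exhaustive search of Section 3.2 affordable.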

The Integral Images for the means and variances are defined as follows:

I^m_{u,j}(x, y) = \sum_{x' \le x, \, y' \le y} \delta_u(x', y') \, D_j(x', y'),
I^s_{u,j}(x, y) = \sum_{x' \le x, \, y' \le y} \delta_u(x', y') \, D_j(x', y')^2,
u = 1, ..., d,  j = 1, 2, 3.   (12)

Fig. 2. Construction of the Integral Image for the histogram. On the left is a rectangle with width w and height h; on the right, each plane corresponds to the Integral Image plane of one bin.

With the following two pairs of recurrences:

i^m_{u,j}(x, y) = i^m_{u,j}(x - 1, y) + D_j(x, y) \, \delta_u(x, y),
I^m_{u,j}(x, y) = I^m_{u,j}(x, y - 1) + i^m_{u,j}(x, y),
u = 1, ..., d,  j = 1, 2, 3,   (13)

i^s_{u,j}(x, y) = i^s_{u,j}(x - 1, y) + D_j(x, y)^2 \, \delta_u(x, y),
I^s_{u,j}(x, y) = I^s_{u,j}(x, y - 1) + i^s_{u,j}(x, y),
u = 1, ..., d,  j = 1, 2, 3,   (14)

the Integral Images for the means and variances can be computed in one pass over the original image. Based on Eqs. (13) and (14), the mean and variance for the j-th channel and the u-th bin are obtained with fast array-index operations:

m_{u,j} = \frac{1}{n_u} \big( I^m_{u,j}(x + w, y + h) - I^m_{u,j}(x + w, y) - I^m_{u,j}(x, y + h) + I^m_{u,j}(x, y) \big),

\sigma^2_{u,j} = \frac{1}{n_u} \big( I^s_{u,j}(x + w, y + h) - I^s_{u,j}(x + w, y) - I^s_{u,j}(x, y + h) + I^s_{u,j}(x, y) \big) - m^2_{u,j},

u = 1, ..., d,  j = 1, 2, 3.   (15)
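A corresponding sketch of Eqs. (12)–(15) (hypothetical helper names, not the authors' code; the per-bin counts n_u would come from the integral histogram query of Eq. (11)):

```python
import numpy as np

def integral_moments(img, bin_map, d):
    """Integral Images for per-bin channel sums and squared sums, Eq. (12).

    img: (H, W, C) color image; bin_map: (H, W) per-pixel bin indices.
    """
    H, W, C = img.shape
    delta = (bin_map[:, :, None] == np.arange(d)).astype(np.float64)
    Im = np.zeros((H + 1, W + 1, d, C))
    Is = np.zeros((H + 1, W + 1, d, C))
    # delta[..., None] masks each channel by bin membership; the two
    # cumulative sums realize the recurrences (13) and (14) in one pass.
    Im[1:, 1:] = (delta[..., None] * img[:, :, None, :]).cumsum(0).cumsum(1)
    Is[1:, 1:] = (delta[..., None] * img[:, :, None, :] ** 2).cumsum(0).cumsum(1)
    return Im, Is

def rect_mean_var(Im, Is, n, x, y, w, h):
    """Per-bin channel mean and variance of a rectangle, Eq. (15).

    n: (d,) per-bin pixel counts of the rectangle (from the integral histogram).
    """
    def rect(I):
        return I[y + h, x + w] - I[y, x + w] - I[y + h, x] + I[y, x]
    n = np.maximum(n, 1)[:, None]          # avoid division by zero in empty bins
    mean = rect(Im) / n
    var = rect(Is) / n - mean ** 2         # E[X^2] - (E[X])^2
    return mean, var
```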

3.2. Object tracking algorithm

The object shape is represented by a rectangle which is allowed to move freely in the image plane and to change width and height at the same scale. Given the object location (position and size) in the previous frame, an exhaustive search is made for the maximal mode in the neighboring region, whose size is twice the object size.

To adapt to scale variation, the object size is changed by ±0.2 in scale and the exhaustive search is repeated. The candidate with the maximum similarity is adopted. The search step in the x and y directions is 10% of the object width and height, respectively.
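The search procedure above can be sketched as follows; `similarity` stands in for the Integral-Image-based model comparison of Section 2.2 and is a placeholder here, while the 2× neighborhood, the ±0.2 scale set and the 10% step follow the description in the text:

```python
def brute_force_search(similarity, x0, y0, w0, h0, scales=(0.8, 1.0, 1.2)):
    """Exhaustive search for the best candidate rectangle.

    similarity: callable scoring a candidate (x, y, w, h); placeholder for
    the Integral-Image-based model comparison of Section 2.2.
    (x0, y0, w0, h0): object rectangle in the previous frame.
    """
    best, best_score = (x0, y0, w0, h0), float("-inf")
    for s in scales:                       # scale changed by +/- 0.2
        w, h = int(round(w0 * s)), int(round(h0 * s))
        dx, dy = max(1, w0 // 10), max(1, h0 // 10)   # 10% search step
        # scan the neighborhood of twice the object size
        for y in range(y0 - h0, y0 + h0 + 1, dy):
            for x in range(x0 - w0, x0 + w0 + 1, dx):
                score = similarity((x, y, w, h))
                if score > best_score:
                    best, best_score = (x, y, w, h), score
    return best, best_score
```

Because every candidate is scored, the returned rectangle is the global maximum over the scanned neighborhood, which is the property contrasted with mean shift below.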

Exhaustive search guarantees that the global maximum is reached, which is superior to a gradient-based algorithm such as the mean shift that can only attain a local maximum. Fig. 3 shows an example. In the left image the girl's face is tracked while it is occluded by the man's face nearby. The right figure shows the probability map, in which the left, global maximum corresponds to the object and the right, local maximum to the man. The convergence of a gradient-descent (ascent)-based algorithm such as the mean shift depends on the initial condition and may be trapped in the local maximum.

Thanks to the proposed Integral Images, the similarity measure can be evaluated at negligible computational cost. Note that for tracking applications only the Integral Images of the neighboring region surrounding the object need to be computed. This is very efficient, and thus, despite the brute-force search in the neighborhood, the algorithm runs very fast.

4. Experiments

The program is written in C++ on a laptop with a 1.8 GHz Intel Pentium-M 745 (Dothan) CPU and 512 MB of memory. The cluster number d is 6 in the proposed algorithm, and the mean shift algorithm is implemented with 32 × 32 × 32 bins. Both algorithms use the RGB color space, are initialized by hand in the first frame, and are compared against manually labelled ground truth.

Four measures are adopted to compare the two algorithms: the x and y coordinates and the size of the computed rectangle, as well as the area of the overlapping region between the true bounding rectangle (ground truth) and the computed one (tracking result). In addition, as a measure of the amount of time in which the object is not effectively followed, the temporal fraction in which there is no overlap between the true bounding rectangle and the computed one is also used. In most of our experiments the temporal fraction is zero, meaning effective tracking throughout the whole sequence. In the following, only the cases where the temporal fraction is not zero are explicitly indicated.

Fig. 3. Exhaustive search guarantees that the global maximum is reached. In the left image the girl's face is tracked while occluded by the man's face nearby. The right figure shows the probability map, in which the left, global maximum corresponds to the object and the right, local maximum to the man.

4.1. Person tracking

The experiment is conducted on a video clipped from the image sequence (size: 388 × 284) named "ThreePastShop2cor.mpg" (frames 480–915) [18]. Among the three subjects walking in the corridor, the one dressed in red clothes on the left side is followed.

Note that from frame 260 the illumination varies, and from frames 360 to 380 one person gradually occludes the subject of interest from the left. Despite these difficulties, both the proposed algorithm and the mean shift algorithm succeed in following the object throughout the complete sequence.

The tracking errors vs. frame index are plotted in Fig. 4, and some typical tracking results using the proposed algorithm are shown in Fig. 5. The average tracking errors and times of both algorithms are shown in Table 1. It can be seen that the tracking errors of the x and y coordinates and the scale using the proposed algorithm are less than those using the mean shift algorithm. The variances of the y coordinate and the scale using the proposed algorithm are less than those of the mean shift, while the x-coordinate variance of the former is a little larger than that of the latter.

During occlusion and the short period that immediately follows (frames 360–420), the scale error of the mean shift algorithm becomes very large, as shown in the bottom left-hand panel of Fig. 4. In this case the size of the bounding rectangle computed by the mean shift tends to be larger and almost encloses the true one; therefore its area error becomes very small in this period.

Fig. 4. Comparison of errors for person tracking between the mean shift algorithm (blue, dotted) and the proposed algorithm (red, solid). From left to right, top to bottom, are shown the errors of x, y, scale and overlapping-region area versus frame index.

Table 1 also shows the average tracking time per frame for the mean shift algorithm (16 ms) and the proposed algorithm (7 ms), in which most of the time (5 ms) is taken by the computation of the Integral Images.

4.2. Human face tracking

The face image sequence (size: 256 × 192) is recorded in a typical office environment [17]. Comparisons between the two algorithms are shown in Fig. 6, and the average errors and variances in Table 2. Some typical tracking results using the proposed algorithm are presented in Fig. 7. Note that the tracking errors of both algorithms in this video stream are larger than those for the person sequence. This is not surprising, because the face sequence is more challenging due to motion of both the camera and the subject, disappearance of the object, severe illumination changes, and occlusion by a similar object.

From frames 140 to 165 the subject gradually turns her back towards the camera and the face becomes invisible, and in the following 100 frames the illumination changes are considerable. The face becomes unseen again when the girl turns around from frames 270 to 360. When the face is invisible, both trackers deviate from the target and the errors become large. The reason is that the reference color model is built from the subject's frontal face. Because the reference color model contains some pixels of hair, the deviation is not large, and tracking recovers when the girl faces the camera again.

From frames 630 to 710 a man's face gradually occludes and then un-occludes the tracked face, and Fig. 8 shows the different behaviors of the two algorithms. When a quite similar object appears nearby, two local maxima appear (refer to Fig. 3); the gradient-based mean shift is trapped in a local maximum and locks onto the man's face. It can be seen from Fig. 6 that the x, y and scale errors of the mean shift become very large. The proposed algorithm, however, performs an exhaustive search and so succeeds in handling this situation.

The average errors of the x and y coordinates and the scale using the proposed algorithm are all less than those using the mean shift algorithm, as Table 2 shows.

Fig. 5. Some typical tracking results using the proposed algorithm. From left to right, top to bottom, are shown frames 20, 80, 148, 220, 322, 369, 381 and 430.

Table 1
Comparison of tracking errors (mean ± standard deviation) and time for person tracking

Algorithm     X error (pixels)  Y error (pixels)  Scale error (%)  Area error (%)  Tracking time* (ms)
Mean shift    2.3 ± 1.8         4.7 ± 2.7         0.12 ± 0.26      0.14 ± 0.15     16
The proposed  2.0 ± 1.9         3.4 ± 2.3         0.02 ± 0.02      0.15 ± 0.11     7 (5)

*The data in parentheses is the average time to compute the Integral Images.

The average tracking time is 20 ms for the mean shift and 15 ms for the proposed algorithm, in which 12 ms is taken by the computation of the Integral Images. The time fraction of the proposed algorithm is significantly less, which indicates that in most frames the object is successfully tracked.

4.3. Performance evaluation of the proposed algorithm vs. cluster number and color space

The cluster number in the above experiments is 6, and it is interesting to see the performance variation vs. the cluster number, which is shown in Table 3 for person tracking and in Table 4 for face tracking, respectively.

Fig. 6. Comparison of errors for face tracking between the mean shift algorithm (blue, dotted) and the proposed algorithm (red, solid). From left to right, top to bottom, are shown the errors of x, y, scale and overlapping-region area versus frame index.

Table 2
Comparison of tracking errors (mean ± standard deviation) and time for face tracking

Algorithm     X error (pixels)  Y error (pixels)  Scale error (%)  Area error* (%)      Tracking time** (ms)
Mean shift    12.0 ± 13.4       17.0 ± 26.0       0.24 ± 0.24      0.57 ± 0.41 (0.22)   20
The proposed  10.2 ± 10.7       13.3 ± 14.1       0.14 ± 0.17      0.51 ± 0.32 (0.09)   15 (12)

*The data in parentheses is the time fraction in which the object is not effectively tracked.
**The data in parentheses is the average time to compute the Integral Images.

As demonstrated in Table 3, as the cluster number increases the scale error becomes larger whereas the area error becomes smaller, and the y error fluctuates. The x error gradually increases from d = 12 to 24 but is still less than that at d = 6.

For face tracking, the y, scale and area errors at d = 12, 18, 24 are less than those at d = 6. The x error at d = 12 is smaller, whereas at d = 18 and 24 it is larger, than that at d = 6. No tendency of consistent increase or decrease is apparent, since as d increases each error almost always fluctuates. For both examples, the tracking time increases significantly as the cluster number grows.

In all, the performance of the proposed algorithm improves slightly with an increasing cluster number, however at the cost of much more CPU time. This shows that a small number of clusters is generally sufficient for the proposed algorithm to describe the color information of a target well.

For the sake of simplicity of the color model and computational efficiency, the gray-level distributions in the different RGB channels are assumed to be independent. Although correlations exist between channels in RGB space, the experiments in Sections 4.1, 4.2 and 4.4 show that the color model under this assumption works well. Full consideration of the covariance matrix may improve the performance of the algorithm, however at the cost of a huge increase in computational load, as no fast algorithm is currently available.

Fig. 7. Some typical tracking results using the proposed algorithm. From left to right, top to bottom, shown are frames 1, 90, 160, 260, 320, 378, 460 and 700.

Fig. 8. Comparison of the two algorithms when a similar object occludes the subject. The top row shows results with the proposed algorithm and the bottom row with the mean shift algorithm. From left to right, shown are frames 630, 662, 670, 690 and 700.

Table 3
Performance vs. cluster number using the proposed algorithm for person tracking

Cluster number d  X error (pixels)  Y error (pixels)  Scale error (%)  Area error (%)   Tracking time* (ms)
6                 2.01 ± 1.94       3.37 ± 2.34       0.021 ± 0.021    0.153 ± 0.109    7 (5)
12                1.75 ± 1.57       4.30 ± 2.41       0.039 ± 0.097    0.132 ± 0.100    15 (12)
18                1.78 ± 1.48       2.78 ± 2.45       0.060 ± 0.093    0.057 ± 0.036    24 (19)
24                1.83 ± 1.55       3.62 ± 2.75       0.074 ± 0.101    0.064 ± 0.039    32 (26)

*The data in parentheses is the average time to compute the Integral Images.

It is interesting to examine the performance of the proposed algorithm in other color spaces with greater channel separation, particularly YCbCr, CIELAB and HSV. For person tracking, as Table 5 shows, in comparison with the errors in RGB space, the y error in YCbCr space increases while the other three decrease, and almost all errors increase in the CIELAB and HSV spaces. For the more challenging face sequence, as shown in Table 6, the tracker fails in both the YCbCr and HSV spaces: the object is lost from frame 370 in the former and from frame 150 in the latter and never recovers. In CIELAB space the x and y errors are larger than those in RGB space, whereas the scale error, area error and time fraction are less than those in RGB space.

From the experiments above, we see that among factors including the independence assumption we made, illumination and appearance changes may play the dominant role in the performance of a tracking algorithm in different color spaces. We note that, to handle this problem, some researchers investigate how to dynamically select the best of many color spaces [16], or the best color features based on linear combinations of the different channels of a color space [3].

4.4. More tracking results

More experiments are made to test the performance of the algorithm on image sequences covering different scenarios: sequences 1 and 2 concern vehicle tracking, and sequences 3 and 4 pedestrian tracking. The tracking results are summarized in Table 7.

Table 4
Performance vs. cluster number using the proposed algorithm for face tracking

Cluster number d  X error (pixels)  Y error (pixels)  Scale error (%)  Area error* (%)        Tracking time** (ms)
6                 10.15 ± 10.71     13.30 ± 14.07     0.143 ± 0.173    0.514 ± 0.316 (0.094)  15 (12)
12                8.85 ± 10.51      10.74 ± 11.16     0.098 ± 0.147    0.386 ± 0.315 (0.011)  35 (32)
18                10.51 ± 9.58      11.34 ± 13.15     0.097 ± 0.121    0.427 ± 0.319 (0.040)  50 (44)
24                10.98 ± 1.68      9.79 ± 10.34      0.115 ± 0.162    0.409 ± 0.329 (0.045)  67 (61)

*The data in parentheses is the time fraction in which the object is not effectively tracked.
**The data in parentheses is the average time to compute the Integral Images.

Table 5
Performance vs. color space using the proposed algorithm for person tracking (cluster number is 6)

Color space  X error (pixels)  Y error (pixels)  Scale error (%)  Area error (%)
RGB          2.01 ± 1.94       3.37 ± 2.34       0.021 ± 0.021    0.153 ± 0.109
YCbCr        1.52 ± 1.67       3.81 ± 2.54       0.019 ± 0.079    0.130 ± 0.108
CIELAB       2.76 ± 2.94       4.03 ± 2.18       0.008 ± 0.010    0.200 ± 0.146
HSV          2.58 ± 2.23       5.99 ± 3.77       0.036 ± 0.035    0.324 ± 0.217

Table 6
Performance vs. color space using the proposed algorithm for face tracking (cluster number is 6)

Color space  X error (pixels)  Y error (pixels)  Scale error (%)  Area error* (%)
RGB          10.15 ± 10.71     13.30 ± 14.07     0.143 ± 0.173    0.514 ± 0.316 (0.224)
YCbCr        —                 —                 —                —
CIELAB       12.25 ± 11.70     18.57 ± 21.77     0.109 ± 0.142    0.513 ± 0.319 (0.165)
HSV          —                 —                 —                —

*The data in parentheses is the time fraction in which the object is not effectively tracked.

In sequence 1 (frames 560 to 760, size: 768 × 576) [9], a car was moving on the highway at an accelerating speed, and its back was captured with a camera installed in another vehicle following it. In this scenario both the foreground and the background are moving, and the appearance changes are non-trivial as the tracked car moves farther and farther away. As seen in the first column of Table 7, all tracking errors but the y error of the proposed algorithm are less than those of the mean shift. The mean shift takes about 20 ms on average, in comparison with 16 ms for the proposed algorithm.

In the second sequence (frames 990 to 1350, size: 768 × 576) [10], a hatchback entered the view from the left, moved forward along the road while passing in front of a row of parked vehicles, and finally moved backward and parked in a slot. In this situation the nearby parked cars, which are similar in appearance to the hatchback, pose a threat to the trackers. The second column of Table 7 shows that the tracking results of the proposed algorithm are better than those of the mean shift except for the y coordinate. The mean shift takes about 21 ms and the proposed algorithm 19 ms to track the object.

The scenario in sequence 3 (frames 208 to 430, size: 720 × 576) [11] is a train-station hall. A person walks quickly to the exit of the hall, away from the camera. Because the person walks fast, severe motion blur occurs in the appearance of the object. As indicated by the third column of Table 7, both algorithms have almost the same scale error. While the y error of the proposed algorithm is less than that of the mean shift, its x error is larger. The main reason the area error of the mean shift is less is that, when the illumination changes from about frame 380, the size of the computed bounding rectangle tends to be larger and almost encloses the true one. The average tracking time is 23 ms for the mean shift and 10 ms for the proposed algorithm.

In sequence 4 (frames 126 to 280, size: 720 × 576) [12], a lady walks straight from left to right. During frames 230 to 250 a man partly occludes the lady while walking past. In this scenario all tracking errors of the proposed algorithm are less than those of the mean shift. The average time is 25 ms for the mean shift and 17 ms for the proposed algorithm.

5. Conclusions

In the paper a color model is proposed based on

K-means clustering, in which the color space is

partitioned adaptively and the histogram bins are

determined accordingly. Moreover, the distribution

of multi-channel gray level is modelled within each

bin to catch more information on object color. To

measure similarity between two color models, a

similarity measure is defined based on Bhattacharrya

distance and its simplified form is derived.

Thanks to the proposed Integral Images, the tracking algorithm is able to search exhaustively yet efficiently for the global maximal mode in the neighboring region. Comparisons with the well-known mean shift show that the proposed algorithm has better performance while retaining the same (or lower) computational cost.
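For reference, a minimal sketch of the Bhattacharyya-based similarity between two normalized histograms (the paper's simplified form is not reproduced here; this is the standard coefficient and distance used in mean-shift-style trackers):

```python
import numpy as np

def bhattacharyya(p, q):
    # Bhattacharyya coefficient rho between two normalized histograms,
    # and the corresponding distance d = sqrt(1 - rho).
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    rho = float(np.sum(np.sqrt(p * q)))
    return rho, float(np.sqrt(max(0.0, 1.0 - rho)))
```

Identical models give rho = 1 (distance 0); the candidate maximizing rho over the search region is selected as the tracking result.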

Currently the bin number is set empirically, a setting that works in all our experiments. Nevertheless, it is desirable to determine the number of bins automatically, so as to account for illumination changes or noise

Table 7
Comparisons of tracking results with different image sequences

               Algorithm    Sequence 1        Sequence 2       Sequence 3       Sequence 4
X error        Mean shift   3.6 ± 3.7         8.0 ± 7.1        2.2 ± 2.0        8.8 ± 9.6
               Proposed     2.9 ± 2.5         6.0 ± 4.6        3.2 ± 2.2        7.1 ± 7.5
Y error        Mean shift   1.6 ± 1.6         4.0 ± 3.0        3.0 ± 2.4        10.1 ± 14.3
               Proposed     1.8 ± 1.6         8.0 ± 7.5        2.6 ± 2.2        9.1 ± 6.7
Scale error    Mean shift   0.0278 ± 0.0314   0.041 ± 0.026    0.021 ± 0.022    0.118 ± 0.087
               Proposed     0.0114 ± 0.0120   0.026 ± 0.018    0.021 ± 0.022    0.035 ± 0.050
Area error     Mean shift   0.0554 ± 0.0393   0.319 ± 0.124    0.280 ± 0.118    0.301 ± 0.370
               Proposed     0.187 ± 0.0951    0.315 ± 0.109    0.411 ± 0.085    0.255 ± 0.271
Time^a         Mean shift   20                21               23               25
               Proposed     16 (14)           19 (16)          10 (8)           17 (15)

Unit of X, Y error: pixels; unit of tracking time: ms; unit of scale and area error: %.
^a The data in parentheses is the average time to compute the Integral Images.

686 L. Peihua / Signal Processing: Image Communication 21 (2006) 676–687

while retaining good discriminative power. Once the Integral Images are computed, the color model can be evaluated very quickly. Applications of the Integral Images, together with the color model, are therefore possible in tasks where a brute-force yet efficient search is needed, such as object detection and sub-image retrieval; these are our future work.
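The constant-time evaluation that makes such a brute-force search feasible can be illustrated as follows (in the paper one integral image is kept per histogram bin; this sketch shows the underlying region-sum trick for a single channel):

```python
import numpy as np

def integral_image(img):
    # Zero-padded cumulative sum over rows and columns, so that any
    # rectangular region sum needs only four lookups.
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def region_sum(ii, r0, c0, r1, c1):
    # Sum of img[r0:r1, c0:c1] in O(1), independent of region size.
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]
```

With one such image per bin, the histogram of every candidate window in the search region can be read off in constant time, which is what enables the exhaustive search over the neighboring region.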

Acknowledgments

The work was supported by the National Natural Science Foundation of China (NSFC) under Grant Number 60505006, Natural Science Foundation of Hei Long Jiang Province (F200512), Science and Technology Research Project of Educational Bureau of Hei Long Jiang Province (1151G033), Postdoctoral Fund for Scientific Research of Hei Long Jiang Province (LHK-04093) and Science Fund of Hei Long Jiang University for Distinguished Young Scholars (JC200406).

References

[1] S.T. Birchfield, S. Rangarajan, Spatiograms versus histograms for region-based tracking, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, June 2005, pp. 1158–1163.
[2] H.-D. Cheng, X.-H. Jiang, Y. Sun, J. Wang, Color image segmentation: advances and prospects, Pattern Recognition 34 (12) (2001) 2259–2281.
[3] R. Collins, Y. Liu, On-line selection of discriminative tracking features, in: Proceedings of the IEEE Conference on Computer Vision, Nice, France, 2003, pp. 346–352.
[4] R.T. Collins, Mean-shift blob tracking through scale space, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2003, pp. 234–241.
[5] D. Comaniciu, V. Ramesh, P. Meer, Real-time tracking of non-rigid objects using mean shift, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2000, pp. 142–149.
[6] A. Gersho, R. Gray, Vector Quantization and Signal Compression, Kluwer Publishers, Dordrecht, 1992.
[7] A.K. Jain, R.C. Dubes, Algorithms for Clustering Data, Prentice-Hall, Englewood Cliffs, NJ, 1988.
[8] A.K. Jain, M. Murthy, P. Flynn, Data clustering: a review, ACM Comput. Rev. 31 (3) (1999) 264–323.
[9] PETS2001 datasets, The University of Reading, UK, found at URL: <http://peipa.essex.ac.uk/ipa/pix/pets/PETS2001/DATASET5/TESTING/CAMERA1_JPEGS/>.
[10] PETS2001 datasets, The University of Reading, UK, found at URL: <http://peipa.essex.ac.uk/ipa/pix/pets/PETS2001/DATASET5/TRAINING/CAMERA1_JPEGS/>.
[11] PETS 2006 dataset S7 camera 4, ISCAPS consortium, found at URL: <http://ftp.cs.rdg.ac.uk/PETS2006/S3-T7-A.zip>.
[12] PETS 2006 dataset S7 camera 3, ISCAPS consortium, found at URL: <http://ftp.cs.rdg.ac.uk/PETS2006/S3-T7-A.zip>.
[13] F. Porikli, Integral histogram: a fast way to extract histograms in Cartesian spaces, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, 2005, pp. 829–836.
[14] Y. Raja, S.J. McKenna, S. Gong, Colour model selection and adaptation in dynamic scenes, in: Proceedings of the European Conference on Computer Vision, 1998, pp. 460–474.
[15] C. Stauffer, W.E. Grimson, Learning patterns of activity using real-time tracking, IEEE Trans. Pattern Anal. Machine Intell. 22 (8) (2000) 747–757.
[16] H. Stern, B. Efros, Adaptive color space switching for tracking under varying illumination, Image Vision Comput. 23 (3) (2005) 353–364.
[17] Test image sequences for face tracking by Stan Birchfield, found at URL: <http://vision.stanford.edu/birch/headtracker/seq/>.
[18] The EC Funded CAVIAR project/IST 2001 37540, found at URL: <http://homepages.inf.ed.ac.uk/rbf/CAVIAR/>.
[19] P. Viola, M. Jones, Rapid object detection using a boosted cascade of simple features, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2001, pp. 511–518.
[20] H. Wang, P. Li, T. Zhang, Proposal of novel histogram features for face detection, in: International Conference on Advances in Pattern Recognition, Bath, UK, 2005, pp. 334–343.
[21] C. Wren, A. Azarbayejani, T. Darrell, A.P. Pentland, Pfinder: real-time tracking of the human body, IEEE Trans. Pattern Anal. Machine Intell. 19 (7) (1997) 780–785.
[22] C. Yang, R. Duraiswami, L. Davis, Efficient mean-shift tracking via a new similarity measure, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2005, pp. 176–183.
[23] Q. Zhao, H. Tao, Object tracking using color correlogram, in: IEEE Workshop on VS-PETS, 2005.

