
Real Time Embedded CAMSHIFT Object Tracking Algorithm using DaVinci TMS320DM6437 EVM

1Khidir Elhadi Bala Ahmed, 2Rania A. Mokhtar, 2Rashid A. Saeed
1Electronic Design and Research Department, Sudan Academy for Science, Khartoum, Sudan
2Sudan University of Science and Technology (SUST), Sudan
khidir_mph@hotmail.com

Abstract — Object tracking in real time is one of the most important topics in the field of computer vision. The work undertaken in this dissertation is mainly focused on the development of a reliable and robust real-time tracking system that can track an object of interest in video acquired from a stationary or moving camera. The proposed algorithm is a real-time algorithm that operates at 25 frames per second, depending on the input video properties and the probability distribution of the intensity of the tracked object. The Digital Video Development Platform DM6437 EVM is used to obtain the real-time video sequence and process each frame to extract the object using the CAMSHIFT algorithm. The information obtained from the frame is extracted, and the mean shift algorithm is used to search for the object within the frame. The experimental results obtained from the proposal prove the consistency and real-time performance of the proposed algorithm.

Index Terms — CCS, CAMSHIFT, histogram, histogram projection, intensity, pixel, real time, TMS320DM6437, YUV, VPFE, VPBE, VPSS.

I. INTRODUCTION

Tracking is defined as the estimation of the trajectory of an object in the image plane while it moves around the scene. In other words, a tracker assigns consistent labels to the tracked object in different frames of the video. In addition, depending on the tracking domain, a tracker can also provide information about the object, such as its orientation, area or shape [1].
Most recently developed tracking algorithms use the following principles: correlation methods, optical flow, background subtraction, particle filtering, methods based on probability density evaluation, etc. The correlation and optical flow methods are distinguished by their high computational complexity, making them hardly suitable for real-time applications [2]. Background subtraction has low robustness in the presence of noise and cannot work when the observation camera is moving.
Research has been carried out for many years in this field. In this sequel, algorithms have been developed and tested on desktop machines [3].
Real-time object detection and tracking is a critical task in many computer vision applications such as surveillance, driver assistance, gesture recognition and man-machine interfaces [4].
For machines, efficient object tracking is a difficult task. The computational complexity of the algorithm is critical for most applications. Several target tracking techniques have been presented by many researchers in the past. Most of the techniques depend on a limited field of view from a stationary camera and require a lot of human-computer interaction [5].
The operational requirements on a video tracking system are target detection, target auto-track, data collection, and real-time data reduction.
Moreover, the primary performance factors are: target characteristics, acquisition range, tracking accuracy and data resolution. The target characteristics which must be considered in the tracking system are: size, target radiance, background radiance, dynamics and range.
An embedded system is a dedicated computer-based system for an application or product. For our applied problem we used a digital signal processor (DSP) oriented to video image processing. In particular, in accordance with the task to be solved, we took the TMS320DM6437 DSP as used on the Spectrum Digital DM6437 EVM evaluation board.
The continuous development of modern industrial technology has put forward higher requirements for the acquisition, processing and transmission of image information. In recent years, high-speed digital signal processing technology has developed rapidly, and the strong signal processing ability of the DSP provides a solid theoretical and application foundation for the real-time processing of image information. Embedded platforms have features such as small size, low cost, low power consumption and low maintenance [6].
Therefore, developing an image processing system on an embedded platform can improve such systems in terms of reliability and controllability. Compared with a traditional image processing system, this system has small size, low cost, good stability and real-time character. Figure (1) shows the overall tracking procedures.

Figure 1: video tracking procedures

The TMS320DM643x Digital Media Processor (DMP) contains a powerful DSP to efficiently handle image, video, and audio processing tasks. The DM643x DMP consists of the following primary components and sub-systems [7]:
1. DSP Subsystem (DSPSS), including the C64x+ megamodule and associated memory.
2. Video Processing Subsystem (VPSS), including the Video Processing Front End (VPFE) image input and image processing subsystem, and the Video Processing Back End (VPBE) display subsystem.
3. A set of I/O peripherals.
4. A powerful Direct Memory Access (DMA) subsystem and DDR2 memory controller interface.

The CCD controller of the VPFE can capture BT.656 formatted video or generic 16-bit or 8-bit YUV digital video data from a digital video source such as an NTSC/PAL video decoder [8]. Table 1 shows the DDR storage format for YCbCr processing.

Table 1: DDR Storage Format for YCbCr Processing

SDRAM    | Upper word         | Lower word
Address  | MSB(15) | LSB(0)   | MSB(15) | LSB(0)
N        | Y1      | Cr0      | Y0      | Cb0
N+1      | Y3      | Cr2      | Y2      | Cb2
N+2      | Y5      | Cr4      | Y4      | Cb4
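For illustration, the byte layout of Table 1 can be decoded in C. The helper below is a hypothetical sketch, not code from the implementation; it assumes each 32-bit SDRAM word packs [Y(odd) | Cr] in the upper 16-bit word and [Y(even) | Cb] in the lower word, as in the table:

```c
#include <stdint.h>

/* One decoded word of the assumed Table 1 layout:
 * two luma samples and one Cb/Cr chroma pair. */
typedef struct { uint8_t y1, cr, y0, cb; } ycbcr_pair_t;

/* Split a 32-bit word read from address N into its four bytes,
 * assuming the packing of Table 1. */
static ycbcr_pair_t unpack_ycbcr(uint32_t word)
{
    ycbcr_pair_t p;
    p.y1 = (uint8_t)(word >> 24); /* upper word, MSB byte: Y1  */
    p.cr = (uint8_t)(word >> 16); /* upper word, LSB byte: Cr0 */
    p.y0 = (uint8_t)(word >> 8);  /* lower word, MSB byte: Y0  */
    p.cb = (uint8_t)(word);       /* lower word, LSB byte: Cb0 */
    return p;
}
```

Reading only the two Y bytes of each word is what later allows the tracker to touch half of the stored bytes per pixel pair.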

The VPBE block is comprised of the on-screen display (OSD) and the video encoder (VENC) modules. Together, these modules provide a powerful and flexible back-end display interface.
• On-screen display (OSD) graphic accelerator: the OSD manages display data in various formats for several types of hardware display windows, and it also handles blending of the display windows into a single display frame, which the video encoder (VENC) module then outputs.
• Video encoder (VENC): the VENC takes the display frame from the on-screen display (OSD) and formats it into the desired output format and output signals (including data, clocks, sync, etc.) that are required to interface to display devices [9].

The video processing subsystem (VPSS) provides an input interface (VPFE) for external imaging peripherals such as image sensors, video decoders, etc., and an output interface (video processing back end, VPBE) for display devices such as analog SDTV displays, digital LCD panels, HDTV video encoders, etc. Figure (2) shows the functional block diagram of the TMS320DM6437 video processing subsystem.

Figure 2: TMS320DM6437 video processing subsystem block diagram.
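The capture-process-display flow through the VPFE and VPBE can be sketched as a simple per-frame loop. The functions below are stand-in stubs, not the TI driver API; on the EVM the capture and display calls would be replaced by the actual VPFE/VPBE driver calls:

```c
#include <stdint.h>
#include <string.h>

#define WIDTH  720
#define HEIGHT 480

static uint8_t frame[WIDTH * HEIGHT]; /* Y plane of the current frame */

/* Stub standing in for a VPFE capture: fills the buffer with a flat
 * test pattern instead of real camera data. */
static void capture_frame(uint8_t *y_plane)
{
    memset(y_plane, 128, (size_t)WIDTH * HEIGHT);
}

/* Stub standing in for the tracking step (histogram, mean shift, ...). */
static void process_frame(const uint8_t *y_plane)
{
    (void)y_plane;
}

/* Stub standing in for the VPBE display hand-off. */
static void display_frame(const uint8_t *y_plane)
{
    (void)y_plane;
}

/* Run the loop for n_frames frames; returns how many were handled. */
int run_pipeline(int n_frames)
{
    int processed = 0;
    for (int i = 0; i < n_frames; ++i) {
        capture_frame(frame);  /* VPFE: acquire     */
        process_frame(frame);  /* DSP:  CAMSHIFT    */
        display_frame(frame);  /* VPBE: show result */
        ++processed;
    }
    return processed;
}
```

At 25 frames per second, one call of `run_pipeline(25)` corresponds to one second of PAL video.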
The VPFE supports image data acquisition from sensor and digital video sources in various modes/formats. YUV sources have minimal image processing applied and can either be passed directly to external memory/DDR2 or passed to the resizer for scaling prior to writing to DDR2. Raw imager data modes (non-YUV sources) are supported by the statistics collection modules (H3A and histogram) as well as full preview engine image signal processing functions, plus resizing after preview.

II. CAMSHIFT Algorithm

The CAMSHIFT algorithm was developed for effective face-and-head tracking in perceptual user interfaces. Its main part represents a robust non-parametric technique for climbing density gradients to find the peak of a probability density [10].
Some drawbacks of the mean shift algorithm are that it fails to track multi-hued targets, it may fail to track the target when its shape and orientation change, and it fails to track fast moving objects [1].
This is an easy process for continuous distributions; it is merely hill climbing applied to a density histogram of the data. This method is efficient compared to standard template matching since it eliminates brute-force search [3].

Mean shift calculation steps:
1- Zeroth moment calculation:
M00 = Σx Σy I(x, y)    (1)
2- First moment calculation:
M01 = Σx Σy x · I(x, y)    (2)
M10 = Σx Σy y · I(x, y)    (3)
3- Mean search window calculation:
xc = M01 / M00    (4)
yc = M10 / M00    (5)
where I(x, y) is the intensity value of point (x, y) in the probability distribution of the image.

The procedure of CAMSHIFT includes two important parts, which are the histogram and the search for the peak probability. The first step of CAMSHIFT is to obtain the histogram of the tracked object. In the second step, the next frame is converted into a map of skin color probability based on the histogram. In the third step, the peak probability center is found based on the zeroth moment and first moments [3][11].
Generally, the CAMSHIFT algorithm calculates the target center position using the following procedure:
A. First of all, choose the initial region of interest which contains the object we want to track.
B. Make a color histogram of that region as the object model.
C. Compute the probability distribution of the frame using the color histogram; in the implementation, the histogram back projection method is used.
D. Based on the probability distribution of the image, find the center of mass of the search window using the mean-shift algorithm.
E. Add the center of the search window to the calculated object centroid and iterate step D until the minimum distance is reached.
F. Process the next frame with the search window position from step E.

III. Problem statement

Much research has been done to implement the above steps on embedded processors such as our target processor, the TMS320DM6437. The results found after code optimization were about 125 milliseconds per frame, besides the failure to track black objects and the failure to track fast moving objects, as mentioned in [1] and [12].

IV. Proposed Solution

Starting from the video format and the video processing subsystem (VPSS) block set of the evaluation module TMS320DM6437, the video processing front end (VPFE) and video processing back end (VPBE) capture the video from the input camera and display the video frames in the required color formats according to the settings [7] and [8]. The approach here uses the YUV format, and then in pixel manipulation we use just the Y component to calculate the back projection histogram and probability distribution function of the target.
The second point is thresholding the image according to the object histogram and unifying the object intensity, then calculating the center of the object using the following equations:
1- Zeroth moment calculation:
M00 = Σx Σy (target pixels)    (6)
2- First moment calculation:
M01 = Σx x    (7)
M10 = Σy y    (8)
where x and y are the vertical and horizontal position of each target pixel.
3- Mean search window calculation:
Same as equations (4) and (5).
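The simplified moments of equations (6)-(8) and the window update of steps D-E can be sketched in C. This is a minimal illustration, assuming a binary target mask has already been produced by thresholding; the image and window sizes here are arbitrary, not those of the EVM implementation:

```c
#include <stdint.h>
#include <stdlib.h>

#define W 64
#define H 64

/* Centroid of target pixels inside a search window, following the
 * simplified moments of equations (6)-(8): M00 counts target pixels,
 * M01 sums their x positions, M10 sums their y positions.
 * Returns 0 when the window contains no target pixels. */
static int centroid(const uint8_t *mask, int wx, int wy, int ww, int wh,
                    int *cx, int *cy)
{
    long m00 = 0, m01 = 0, m10 = 0;
    for (int y = wy; y < wy + wh; ++y)
        for (int x = wx; x < wx + ww; ++x)
            if (mask[y * W + x]) { ++m00; m01 += x; m10 += y; }
    if (m00 == 0)
        return 0;
    *cx = (int)(m01 / m00); /* equation (4) */
    *cy = (int)(m10 / m00); /* equation (5) */
    return 1;
}

/* Mean-shift loop (steps D-E): move the window center onto the
 * centroid until the shift drops below one pixel. */
static void mean_shift(const uint8_t *mask, int ww, int wh, int *cx, int *cy)
{
    for (int it = 0; it < 20; ++it) {
        int wx = *cx - ww / 2, wy = *cy - wh / 2;
        if (wx < 0) wx = 0;
        if (wy < 0) wy = 0;
        if (wx + ww > W) wx = W - ww;
        if (wy + wh > H) wy = H - wh;
        int nx, ny;
        if (!centroid(mask, wx, wy, ww, wh, &nx, &ny))
            return; /* target lost inside the window */
        int dx = nx - *cx, dy = ny - *cy;
        *cx = nx;
        *cy = ny;
        if (abs(dx) <= 1 && abs(dy) <= 1)
            return; /* converged: minimum distance reached */
    }
}
```

Because only counting and addition are needed per target pixel, this form avoids the per-pixel multiplications of equations (2) and (3).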
V. Methodology

The experimental setup shown in Figure 3 and Figure 4 was installed, and the C code was written and compiled using Code Composer Studio software, which is designed by Texas Instruments to work with the evaluation module TMS320DM6437.

Figure 3: experimental setup

Figure 4: system interaction block diagram

The PAL video input was selected by setting the evaluation module TMS320DM6437 to this format, which is received from the daylight camera. The width of the image is 720 pixels per line, and each pixel has two components, (YU) or (YV); the Y component is one byte and the U or V component is one byte, so in each line we have 1440 bytes. The number of horizontal lines (image height) is 480 lines, which was selected by the iteration parameters.
In frame processing we used the Y component, which is quite enough for our proposal to calculate the histogram of the selected object to be tracked. The target is selected by sending a lock command to the evaluation module through the UART from the computer serial port, using C# software which was designed to send this command. The vertical and horizontal histogram projections were calculated to find the target size.
The next step is to calculate the histograms of the object and the background, to threshold the image, and to unify the intensities of the target and background using the Y component; then the probability distribution of the frame was defined as a sign. The proposed solution for decreasing the number of bytes to be calculated is the modification of equations (1), (2) and (3) to equations (6), (7) and (8) respectively. The center of mass of the search window was then calculated using the mean-shift algorithm, based on the color probability distribution of the image. After that, the center of the search window was added to the object's centroid and the process was iterated until the minimum distance was reached. The next frame was then processed with the search window position from the last step.

VI. Results and Discussion

Test procedures were established according to the parameters and quantities in Table 2.

Table 2: Design Parameters

Parameter              | Quantity
Input video format     | PAL
Frame processing time  | <40 ms
Target minimum size    | <50 pixels
Search window size     | 200×100
Target color           | Different colors

The C code was compiled and loaded to the evaluation module, and the target was selected by a lock command sent from the PC UART. Tracking the selected target started from the center of the field of view, and the target was tracked. This procedure was carried out with different types of targets: different sizes, different color distributions, and different speeds according to the distances from the camera. Figures (5), (6), (7), (8) and (9) were taken as sample results of the test.
The images were captured from the evaluation module output, and each image shows the target in the tracking window. On the top left of each image the frame number is written; the number at the left center of the image is the number of pixels which represent the target; and on the bottom left of the image the center of the target position is written in horizontal and vertical mode.
The time needed for the frame processing code was measured with the aid of the CLK_gethtime() procedure available in DSP/BIOS [13]. Also, no special camera settings were changed and no color conversions were done to aid the algorithm in searching for and tracking the object.
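The histogram-based thresholding step described above can be sketched as follows. This is an illustrative version assuming an 8-bit Y plane; `min_count` is a hypothetical tuning parameter standing in for whatever threshold the implementation actually derives from the object and background histograms:

```c
#include <stdint.h>
#include <string.h>

#define NBINS 256

/* Build a luminance histogram of the selected region of interest. */
static void roi_histogram(const uint8_t *y, int stride,
                          int rx, int ry, int rw, int rh,
                          unsigned hist[NBINS])
{
    memset(hist, 0, NBINS * sizeof(unsigned));
    for (int j = ry; j < ry + rh; ++j)
        for (int i = rx; i < rx + rw; ++i)
            ++hist[y[j * stride + i]];
}

/* Threshold the whole frame against the ROI histogram: a pixel is
 * marked as target (1) when its Y value was frequent enough in the
 * ROI, which unifies the target intensity into a binary mask. */
static void backproject(const uint8_t *y, uint8_t *mask, int npix,
                        const unsigned hist[NBINS], unsigned min_count)
{
    for (int k = 0; k < npix; ++k)
        mask[k] = (hist[y[k]] >= min_count) ? 1 : 0;
}
```

The resulting binary mask is what the simplified moments of equations (6)-(8) are computed over: the target pixels are simply counted rather than weighted.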
As shown in Figure 5, the number of object pixels is large, about 19996 pixels, and the object was tracked in all frames at different moving speeds.

Figure 5: large size object tracking

Figure 6: small size object tracking

Figure 7: yellow and small size object tracking

Figure 8: non-rigid object tracking

Figure 9: black object tracking

Figures (6) and (7) show small-size objects. In these figures the targets were small with different color distributions, and the objects were tracked in all video frames until the release command.
Figures 8 and 9 prove that the tracking algorithm works well with these types of target: non-rigid objects can be tracked easily, and black objects can also be tracked without any divide-by-zero problem, whereas previous research on this approach produced an algorithm that fails to track black objects, and another fails to track fast moving objects.
The processing time of each frame was measured; as shown in Figure 10, the resulting processing time is about 39.5 milliseconds, which proves that the algorithm is applicable in real-time applications.
If we compare the results shown in Figures (10) and (11) with the results of [12], the processing time is decreased from 125 ms after optimization to 39.5 ms after optimization. Also, the algorithm can track objects of different colors, including black objects.
The target size was also tested to check the effect of the number of calculations on the processing time, and the results prove that the time is quite enough to handle the largest object sizes; as shown in Figure (11), in all frames the processing time is about 39.5 milliseconds.
The advantage of selecting the Y component from the YUV format is decreasing the number of manipulated bytes for each pixel to 50% of the total bytes. Besides, the multiplication and summation operations are decreased by unifying the target intensities and just counting the target pixels.

Figure 10: frame processing time in milliseconds

Figure 11: processing time for each frame according to the object number of pixels.

VII. Conclusion

We can conclude that the designed system is able to track various color objects easily, even when their size and shape change, besides tracking fast objects. The algorithm proves its success in searching for and localizing each target in real time on the proposed embedded platform.
This system with this algorithm is applicable in surveillance systems and clear-background target tracking, and it can also be improved to work as an automatic moving object detector and tracker.
Still, a lot of improvements are needed in this approach to increase the performance, property and value of this algorithm depending on the application needs.

REFERENCES

[1] J. Navin Sankar, S. Mary Joans, S. J. Grace Shoba, A. Arun, "Enhanced Object Tracking Using Davinci Processors", International Journal of Soft Computing and Engineering (IJSCE), ISSN: 2231-2307, Volume-3, Issue-1, March 2013.
[2] L. Roichman, Y. Solomon, Y. Moshe, "Real-Time Pedestrian Detection and Tracking", Proc. of the 3rd European DSP Education and Research Symposium (EDERS 2008), Tel-Aviv, June 2008, pp. 281-288.
[3] Afef Salhi and Ameni Yengui Jammoussi, "Object tracking system using Camshift, Meanshift and Kalman filter", World Academy of Science, Engineering and Technology, 2012.
[4] Krutika A. Veerapur, Ganesh V. Bhat, "Colour Object Tracking On Embedded Platform Using OpenCV", International Journal of Recent Technology and Engineering (IJRTE), ISSN: 2277-3878, Volume-2, Issue-3, July 2013.
[5] S. Kanwal, M. H. Yusuf, "Optimization of Real Time Edge Enhanced Object Tracking Algorithm on Video Processor", Technical Journal, University of Engineering and Technology (UET) Taxila, Pakistan, Vol. 19, No. IV, 2014.
[6] S. Mahesh Kumar, Anitha Julian, "An Embedded Digital Image Processing System using Blackfin 548 DSP", International Conference on Computing and Control Engineering (ICCCE 2012), 12 & 13 April 2012.
[7] Texas Instruments, TMS320DM643x DMP DSP Subsystem Reference Guide, Literature Number: SPRU978E, March 2008.
[8] Texas Instruments, TMS320DM643x DMP Video Processing Front End (VPFE) User's Guide, Literature Number: SPRU977C, November 2009.
[9] Texas Instruments, TMS320DM643x DMP Video Processing Back End (VPBE) User's Guide, Literature Number: SPRU952A, December 2007.
[10] G. Bradski, "Computer Vision Face Tracking For Use in a Perceptual User Interface", Intel Technology Journal, pp. 1-15, Q2 1998.
[11] Greice Martins de Freitas, Clésio Luis Tozzi, "Object Tracking by Multiple State Management and Eigenbackground Segmentation", International Journal of Natural Computing Research, 1(4), 29-36, October-December 2010.
[12] Anton Varfolomieiev, Oleksandr Antonyuk, Oleksandr Lysenko, "Camshift Object Tracking Algorithm Implementation On DM6437 EVM", The 4th European DSP in Education and Research Conference, 2010.
[13] Texas Instruments, TMS320C6000 DSP/BIOS 5.x Application Programming Interface (API) Reference Guide, Literature Number: SPRU403S, August 2012.