
An Intelligent Video Surveillance System

Siyuan Gao
Changchun Institute of Optics, Fine Mechanics and Physics,
Chinese Academy of Sciences
Changchun, Jilin Province, China

Abstract—This paper presents an intelligent video surveillance system. The system is composed of one or more nodes, combined flexibly according to the application scenario, such as private properties, banks and museums. Each node is an autonomous vision-based device capable of performing intelligent tasks. It is able to digitize and compress the acquired analog video signal to the MPEG-4 standard and then transmit the compressed video stream to the control center. At the same time, the node makes use of a statistical approach for real-time moving object detection and online alarm generation, enabling a single human operator to monitor activities over a complex area using a distributed network of active video sensors. The node is implemented on a high-performance platform to ensure that the algorithm runs in real-time. Applications demonstrate that the intelligent system performs well in many scenarios, helping people make decisions more accurately and rapidly.

Keywords—intelligent video surveillance; moving object detection; background subtraction; TMS320DM6446

I. INTRODUCTION

Video surveillance systems have long been in use to monitor security-sensitive areas. The history of video surveillance consists of three generations of systems, called 1GSS, 2GSS and 3GSS [1]. Considering the problems of traditional video surveillance systems, an intelligent video surveillance system should be able to generate online alarms to assist human operators effectively [2]. In order to achieve this goal, the intelligent video surveillance system requires fast, reliable and robust algorithms for moving object detection, as well as a high-speed, stable hardware platform that supports complex tasks such as moving object detection, video compression and real-time video stream transmission.

This paper presents an intelligent video surveillance system which is composed of one or more intelligent nodes, combined flexibly according to the application scenario. Moving object detection plays a key role in an intelligent video surveillance system [3]. The node makes use of a robust and efficiently computed background subtraction algorithm that is able to cope with local illumination changes, such as shadows and highlights, as well as global illumination changes. The algorithm is based on a computational color model which separates the brightness from the chromaticity component [4]. The intelligent node is implemented on a platform which has a processing unit with high computational power, large memory, and an efficient video acquisition front end. It compresses the video data to the MPEG-4 standard and then transmits them to the control center. At the same time, it runs the moving object detection algorithm in real-time and sends an alarm to the control center if an abnormal situation occurs.

The paper is organized into five sections. Section 2 presents a robust moving object detection algorithm using a statistical approach for background subtraction. Section 3 presents the hardware platform and the structure of the system. Section 4 shows the experimental results. Finally, some conclusions are drawn in section 5.

II. THE ALGORITHM OF MOVING OBJECT DETECTION IN THE INTELLIGENT NODE

The algorithm in this paper can be classified as a background subtraction method. The basic scheme of background subtraction is to subtract the current image from a reference image that models the background scene [5]. The basic steps are as follows:

A. Background Modeling

In the background training process, the background is modeled statistically on a pixel-by-pixel basis. A pixel is modeled by \{E_i, s_i, \alpha_i, \beta_i\}, where E_i is the expected color value, s_i is the standard deviation of the color value, \alpha_i is the variation of the brightness distortion, and \beta_i is the variation of the chromaticity distortion of the i-th pixel.

The expected color value of pixel i is given by

    E_i = [e_R(i), e_G(i), e_B(i)]    (1)

where e_R(i), e_G(i), and e_B(i) are the arithmetic means of the i-th pixel's red, green and blue values computed over N background frames. s_i is given by

    s_i = [\sigma_R(i), \sigma_G(i), \sigma_B(i)]    (2)

where \sigma_R(i), \sigma_G(i) and \sigma_B(i) are the standard deviations of the i-th pixel's red, green and blue values computed over N background frames. The brightness distortion is defined as

    m_i = \frac{\dfrac{I_R(i)\,e_R(i)}{\sigma_R^2(i)} + \dfrac{I_G(i)\,e_G(i)}{\sigma_G^2(i)} + \dfrac{I_B(i)\,e_B(i)}{\sigma_B^2(i)}}{\left(\dfrac{e_R(i)}{\sigma_R(i)}\right)^2 + \left(\dfrac{e_G(i)}{\sigma_G(i)}\right)^2 + \left(\dfrac{e_B(i)}{\sigma_B(i)}\right)^2}    (3)

where I_R(i), I_G(i) and I_B(i) are the i-th pixel's red, green and blue values in the current frame.

978-1-4244-7161-4/10/$26.00 ©2010 IEEE
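The training statistics in equations (1) and (2) can be sketched in code. The following is an illustrative sketch, not the authors' implementation; NumPy, the array layout (N, H, W, 3) and all names are our assumptions:

```python
import numpy as np

def train_background(frames):
    """Per-pixel background model from N RGB background frames.

    frames: array of shape (N, H, W, 3).
    Returns E (expected color values, eq. 1) and s (standard
    deviations, eq. 2), both float arrays of shape (H, W, 3).
    """
    stack = np.asarray(frames, dtype=np.float64)
    E = stack.mean(axis=0)      # e_R(i), e_G(i), e_B(i) per pixel
    s = stack.std(axis=0)       # sigma_R(i), sigma_G(i), sigma_B(i)
    s = np.maximum(s, 1e-6)     # keep later divisions by sigma well defined
    return E, s
```

The clamp on s is our addition: a pixel whose color never varies during training would otherwise produce a zero standard deviation in equations (3) and (4).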

The chromaticity distortion is defined as

    n_i = \sqrt{\left(\frac{I_R(i) - m_i\,e_R(i)}{\sigma_R(i)}\right)^2 + \left(\frac{I_G(i) - m_i\,e_G(i)}{\sigma_G(i)}\right)^2 + \left(\frac{I_B(i) - m_i\,e_B(i)}{\sigma_B(i)}\right)^2}    (4)

\alpha_i represents the variation of the brightness distortion of the i-th pixel, which is given by

    \alpha_i = \mathrm{RMS}(m_i) = \sqrt{\frac{\sum_{j=1}^{N}\left(m_i^{(j)} - 1\right)^2}{N}}    (5)

where m_i^{(j)} denotes the brightness distortion of the i-th pixel in the j-th background frame. \beta_i represents the variation of the chromaticity distortion of the i-th pixel, which is given by

    \beta_i = \mathrm{RMS}(n_i) = \sqrt{\frac{\sum_{j=1}^{N}\left(n_i^{(j)}\right)^2}{N}}    (6)

B. Pixel Classification

A given pixel is classified into one of four categories:

•	Original background (B)
•	Shaded background or shadow (S)
•	Highlighted background (H)
•	Moving foreground object (F)

Let

    \hat{m}_i = \frac{m_i - 1}{\alpha_i}    (7)

    \hat{n}_i = \frac{n_i}{\beta_i}    (8)

be the normalized brightness distortion and the normalized chromaticity distortion respectively, so that a single threshold can be used for all of the pixels.

Based on these definitions, a pixel is classified into one of the four categories B, S, F, H by the following decision rule:

    M(i) =
      F : (\hat{n}_i > \tau_n) \ \text{or} \ (\hat{m}_i < \tau_{so}), \ \text{else}
      B : (\hat{m}_i > \tau_{\alpha 2}) \ \text{and} \ (\hat{m}_i < \tau_{\alpha 1}), \ \text{else}
      S : \hat{m}_i < 0, \ \text{else}
      H : \text{otherwise}

where \tau_n, \tau_{so}, \tau_{\alpha 1}, and \tau_{\alpha 2} are the threshold values used to determine the similarity of the chromaticity and brightness between the background image and the current image.

C. Automatic Threshold Selection

The appropriate thresholds are determined by a statistical learning procedure. First, the histograms of \hat{m}_i and \hat{n}_i are constructed. The histograms are built from the combined data of a long sequence captured during the background learning period, so the total sample is N × X × Y values for each histogram. (The image is X × Y pixels and the number of trained background frames is N.) After constructing the histograms, the thresholds are automatically selected according to the desired detection rate r. \tau_n is the normalized chromaticity distortion value at the detection rate r. For the brightness distortion, two thresholds (\tau_{\alpha 1} and \tau_{\alpha 2}) are needed to define the brightness range: \tau_{\alpha 1} is the \hat{m}_i value at that detection rate, and \tau_{\alpha 2} is the \hat{m}_i value at the (1 − r) detection rate. \tau_{so} is a lower bound for the normalized brightness distortion, used to prevent dark pixels from always being misclassified as shadow.

D. The Flowchart of the Algorithm

The algorithm is presented as follows (figure 1):

1) Obtain N frame images.
2) Compute E_i and s_i.
3) Get a frame of new image.
4) Compute \hat{m}_i and \hat{n}_i.
5) Construct the histograms, then determine the thresholds \tau_{\alpha 1}, \tau_{\alpha 2}, \tau_n and \tau_{so}.
6) If (\hat{n}_i > \tau_n) or (\hat{m}_i < \tau_{so}), set M(i) = F;
7) otherwise, if (\hat{m}_i > \tau_{\alpha 2}) and (\hat{m}_i < \tau_{\alpha 1}), set M(i) = B;
8) otherwise, if \hat{m}_i < 0, set M(i) = S; else set M(i) = H.

Figure 1. Flowchart of the algorithm.
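The threshold selection of section C and the decision rule of section B can be rendered as the following sketch. This is our illustrative Python, not the authors' DSP implementation; the percentile-based selection follows the histogram description above, and the fixed value of tau_so is a hypothetical choice:

```python
import numpy as np

def select_thresholds(m_hat, n_hat, r=0.99):
    """Pick thresholds from the training histograms (section C).

    m_hat, n_hat: normalized brightness/chromaticity distortions
    collected over all N*X*Y training samples.
    r: desired detection rate.
    """
    tau_a1 = np.percentile(m_hat, 100 * r)        # m_hat value at rate r
    tau_a2 = np.percentile(m_hat, 100 * (1 - r))  # m_hat value at rate 1-r
    tau_n = np.percentile(n_hat, 100 * r)         # chromaticity threshold
    tau_so = -4.0   # hypothetical lower bound against dark-pixel "shadow"
    return tau_a1, tau_a2, tau_n, tau_so

def classify(m_hat, n_hat, tau_a1, tau_a2, tau_n, tau_so):
    """Decision rule: F(oreground), B(ackground), S(hadow), H(ighlight)."""
    label = np.full(m_hat.shape, 'H', dtype='<U1')    # default: highlight
    label[m_hat < 0] = 'S'                            # darker than the model
    label[(m_hat > tau_a2) & (m_hat < tau_a1)] = 'B'  # within brightness range
    label[(n_hat > tau_n) | (m_hat < tau_so)] = 'F'   # foreground written last,
    # which reproduces the if/else priority F > B > S > H of the decision rule
    return label
```

Writing the masks in reverse priority order (H, then S, B, F) lets this vectorized version reproduce the sequential checks of the flowchart without explicit branching per pixel.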
III. THE HARDWARE PLATFORM AND THE STRUCTURE OF THE SYSTEM

A. Hardware Platform

In order to implement the intelligent node, the hardware platform employs an architecture in which a Digital Media Processor and an FPGA work together.

The Digital Media Processor is a TMS320DM6446, which meets the processing needs of networked media applications. Its dual-core architecture provides the benefits of both DSP and RISC technologies, incorporating a high-performance C64x+ DSP core and an ARM9 core. Using this dual core makes the full DSP computational power available for algorithm components such as moving object detection and video compression, while the ARM core deals with all peripherals to fetch media data [6].

The FPGA is a STRATIX II EP2S130, which meets the need of preprocessing the acquired images in order to save the CPU cycles consumed by the moving object detection algorithm [7]. It also provides the On-Screen Display feature which is necessary for the intelligent video surveillance system [8].

The structure of the hardware platform is shown in figure 2.

[Block diagram; labels include: Digital Media Processor, FPGA, NAND, DDR, CFG, hard disk, encoder, decoder, monitor, camera.]

Figure 2. The structure of the hardware platform for the intelligent node.

As shown above, the moving object detection algorithm is able to run in real-time at a resolution of 720 × 576 pixels (25 fps) thanks to the preprocessing operations implemented in the FPGA. The MPEG-4 compression and transmission over Ethernet can also be performed in real-time with further code optimization.

B. Architecture of the Intelligent System

The intelligent video surveillance system, which is composed of one or more nodes, is shown in figure 3. The intelligent nodes and the control center are organized as a local area network (LAN) by a switch. The LAN is able to connect to the Internet through a broadband router, and it can also be accessed by an authorized remote terminal.

[Block diagram; labels include: node, node access, switch, broadband router, Internet, control center, recording database.]

Figure 3. The architecture of the intelligent system.

IV. EXPERIMENTAL RESULTS

The intelligent node, which runs the robust algorithm, is able to detect moving objects successfully in scenes with a complex background. Figure 4 shows the result of applying the algorithm to several frames of indoor and outdoor scenes. Images (a) and (b) in fig. 4 show an indoor scene, and images (c) and (d) in fig. 4 show an outdoor scene containing some small motions of leaves in the background.

Figure 4. The effect of motion detection in different scenes: (a), (b) indoor; (c), (d) outdoor.

The control center is a multi-screen display system which can be configured according to the number of nodes. The moving object detected by the intelligent node is marked and displayed in the control center, as shown in figure 5.

Figure 5. The control center of the intelligent system.

V. CONCLUSION

This paper presented an intelligent video surveillance system with real-time moving object detection and alarm generation. The system is able to work successfully in many scenarios such as banks, museums and private properties. Applications prove that the system provides the human operator with high-level data to help him make decisions more accurately and rapidly. It enables a single operator to handle the activities over a complex area efficiently.

REFERENCES

[1]	F. Oberti, G. Ferrari, and C. S. Regazzoni, "A comparison between continuous and burst, recognition driven transmission policies in distributed 3GSS," chapter 22, pp. 267–278, in Video-Based Surveillance Systems, Kluwer Academic Publishers, Boston, 2002.
[2]	Y. Dedeoglu, "Moving object detection, tracking and classification for smart video surveillance," PhD Thesis, Department of Computer Engineering and the Institute of Engineering and Science of Bilkent University.
[3]	A. Hampapur, L. Brown, and J. Connell, "Smart video surveillance: exploring the concept of multiscale spatiotemporal tracking," IEEE Signal Processing Magazine, 2005, pp. 38–41.
[4]	T. Horprasert, D. Harwood, and L. S. Davis, "A statistical approach for real-time robust background subtraction and shadow detection," IEEE International Conference on Computer Vision, 1999.
[5]	A. Elgammal, R. Duraiswami, and D. Harwood, "Background and foreground modeling using nonparametric kernel density estimation for visual surveillance," Proceedings of the IEEE, vol. 90, no. 7, 2002.
[6]	Texas Instruments, Digital Media Processor DM6446 Datasheet.
[7]	K. Wiatr, "Dedicated hardware processors for a real-time image data pre-processing implemented in FPGA structure," Image Analysis and Processing, vol. 1131/1997, Springer Berlin/Heidelberg, pp. 69–76.
[8]	T. Luo, S. Y. Yao, and Z. F. Shi, "OSD design and FPGA implementation in video format conversion IC," Journal of Jilin University (Engineering and Technology Edition), vol. 38, 2008, pp. 1452–1457.