
IEEE International Workshop on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications

21-23 September 2009, Rende (Cosenza), Italy

A Real-Time Motion Detection for Video Surveillance System
Yuriy Kurylyak
Research Institute of Intelligent Computer Systems, Ternopil National Economic University
yuk@tneu.edu.ua

Abstract – This paper describes an approach to moving object detection in a video stream using improved background subtraction and model updating methods as well as a hierarchical data structure. The improved method of background model updating makes it possible to dynamically change the speed of such updating depending on the average change of pixel values. Applying the modified algorithm of quadtree creation speeds up the processing by comparing not all the pixels but only randomly selected ones, and by reusing the quadtree structure created on the previous frame instead of creating it from scratch. Experimental results show that the proposed method is up to 10% faster than simple difference and up to 45% faster than running average.

Keywords – Intelligent Video Surveillance, Motion Detection, Background Subtraction, Background Model Updating, Hierarchical Data Structure.

I. INTRODUCTION

In recent years, motion detection in video streams has received great attention from scientists in the computer vision area. The reason for such attention is the possibility of its usage in many areas such as video surveillance systems, traffic monitoring, gesture recognition, advanced user interfaces, tracking of players in sport games, etc. All these areas require, as a first stage, to automatically identify objects, people or events that are of interest.

The frames obtained from a video camera contain a lot of redundant information. For example, in [1] a speed of 15-25 frames per second was achieved while detecting faces in the video stream from one static camera with a resolution of 352x288 pixels. However, this speed dramatically slows down when the number of faces in the frame increases, as well as when more cameras are connected to the same computer. Therefore, real-time detection of moving objects is a very important and topical task for video surveillance systems, and the main issue here is the computational complexity of the algorithm: it is not so important how accurately the segmentation is performed if it takes all the processor time and the processing speed is very slow.

There are a number of methods proposed in the literature for moving object detection [2-9], and all of them can be divided into three classes: temporal difference, optical flow and background subtraction. The background subtraction methods are the most commonly used with static cameras because of their high performance and low memory requirements [10]. The idea is to create a background model and compare it with the current image, so that the foreground objects which appeared in the scene are detected. This method is very flexible and fast; however, the background as well as the camera should remain static.

Usually the running average method is used for background updating. There is also its modified version, where different coefficients are used for updating the background and foreground pixels [6]:

    B_t = \begin{cases} \alpha B_{t-1} + (1-\alpha) I_t, & I_t \in BG \\ \beta B_{t-1} + (1-\beta) I_t, & I_t \in FG \end{cases},    (1)

where B_t and I_t are the background and current images at time t, and \alpha, \beta are coefficients which define how fast the background and foreground pixels are updated, respectively.
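As an illustration of the selective update in (1), the following minimal C++/OpenCV sketch blends each pixel with \alpha or \beta depending on whether it was classified as background or foreground. The function and variable names, the float background buffer and the 8-bit mask convention are our own assumptions, not details given in the paper. A coefficient close to 1 makes the corresponding pixels change slowly, while a smaller value incorporates the current frame faster.

    #include <opencv2/core/core.hpp>

    // Selective running-average update, Eq. (1): background pixels are blended
    // with coefficient alpha, foreground pixels with beta.
    // 'frame'      - current grayscale image I_t (CV_8UC1)
    // 'fgMask'     - non-zero where the pixel was classified as foreground
    // 'background' - background model B, stored as CV_32FC1
    void updateBackground(const cv::Mat& frame, const cv::Mat& fgMask,
                          cv::Mat& background, double alpha, double beta)
    {
        for (int y = 0; y < frame.rows; ++y) {
            const uchar* img = frame.ptr<uchar>(y);
            const uchar* fg  = fgMask.ptr<uchar>(y);
            float* bg        = background.ptr<float>(y);
            for (int x = 0; x < frame.cols; ++x) {
                const double c = (fg[x] != 0) ? beta : alpha;  // I_t in FG or BG
                bg[x] = static_cast<float>(c * bg[x] + (1.0 - c) * img[x]);
            }
        }
    }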
The main drawback of this approach is that for each new frame it is necessary to compare every pixel with the corresponding background pixel and also to update the background model. As shown in Fig. 1, calculating the frame difference slows down the processing speed by 15-20%, and updating the background model reduces the system performance by 40-45% (the performance was calculated without displaying the image on the screen). Even if the speed seems to be very high, it reflects only the motion segmentation without any further stages that will be applied in a complete surveillance system. So the more frames it is possible to process per second, the better the overall performance will be.

Fig. 1. Relation between the number of processed frames per second and detection method.

To solve this problem, in [11] it was proposed to use a hierarchical data structure based on a regularly decomposed region quadtree. The idea is to divide the input image step by step into 4 subimages until the final level is reached. The root node represents the whole image (level 0), its children correspond to equal-sized quadrants (level 1), and so on (Fig. 2a). Fig. 2b shows an example of decomposition to level 3.
Fig. 2. An example of representing image by the quadtree [11].
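To make the decomposition more concrete, the sketch below builds such a regularly decomposed region quadtree over an image region down to a chosen final level. This is our own illustration of the structure described in [11], not the original code; the node layout and names are assumptions, and memory management is omitted for brevity.

    // A node of the regularly decomposed region quadtree: the root covers the
    // whole image (level 0) and every node is split into four equal-sized
    // quadrants until the final level is reached.
    struct QuadNode {
        int x, y, w, h;        // image region covered by this node
        int level;             // 0 for the root
        QuadNode* child[4];    // the four quadrants, nullptr for a leaf
    };

    QuadNode* buildQuadtree(int x, int y, int w, int h, int level, int finalLevel)
    {
        QuadNode* n = new QuadNode{ x, y, w, h, level,
                                    { nullptr, nullptr, nullptr, nullptr } };
        if (level == finalLevel)           // leaf of the decomposition
            return n;

        const int hw = w / 2, hh = h / 2;
        n->child[0] = buildQuadtree(x,      y,      hw, hh, level + 1, finalLevel);
        n->child[1] = buildQuadtree(x + hw, y,      hw, hh, level + 1, finalLevel);
        n->child[2] = buildQuadtree(x,      y + hh, hw, hh, level + 1, finalLevel);
        n->child[3] = buildQuadtree(x + hw, y + hh, hw, hh, level + 1, finalLevel);
        return n;
    }

    // Example: a full decomposition to level 3 of a 352x288 frame.
    // QuadNode* root = buildQuadtree(0, 0, 352, 288, 0, 3);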
However, such a quadtree has to be created from scratch for each new input image and therefore requires additional computational time.

This paper focuses on ways to improve the above-mentioned methods of hierarchical data structure and background updating in order to reduce the computational complexity and increase the processing speed.
The rest of this paper is organized as follows: in Section II we describe the improved hierarchical data structure for fast background subtraction. In Section III we present our approach for updating the background image, based on the statistics of the average deviation of the background pixels. Finally, Section IV gives experimental results and Section V concludes.
II. IMPROVED HIERARCHICAL DATA STRUCTURE

In this paper we use as a basis the approach proposed in [11]. Each input image is divided down to some initial level l_init, and then for each node a random pixel is selected and classified as background or foreground. If the pixel is classified as foreground, the node is subdivided into the next level and the same procedure is applied to each of its children. Otherwise, the next node is considered. The final level l_final of the tree, as well as the initial level l_init, should be selected experimentally depending on the image dimensions.

Such an approach causes a rough segmentation, since only one pixel is compared from each subimage. In [11] it was proposed to calculate the number of foreground and background pixels in each subimage and then find the ratio between the total number of pixels and the foreground ones. If a node contains a significant amount of foreground pixels, the neighboring pixels are verified for the final segmentation and the objects' boundaries are detected. But this requires additional processor time.

The authors already proposed in [12] a modified algorithm of creating such a quadtree, which consists of the following steps and is intended to avoid the validation of neighbors and to increase the detection accuracy. First, a random pixel is selected within the current node. If it belongs to the foreground, this node is subdivided into the next level as in [11]. But if the pixel is classified as background, this can mean either that there are no moving objects in this segment, or that the pixel was classified incorrectly, or that the object occupies only a part of the segment. In this case it was proposed to select another two random pixels, and only if both of them are not classified as foreground is the segment finally rejected. The validation is performed according to

    FG_n = (I_n(P_1) > T) \lor (I_n(P_2) > T) \lor (I_n(P_3) > T),    (2)

where I_n(P_1), I_n(P_2), I_n(P_3) are the pixel values of segment n at the random positions P_1, P_2, P_3. Fig. 3 shows an example of such image division.

Fig. 3. Background subtraction example.

Moreover, we propose not to create a full quadtree for each new image, but to modify the one created on the previous frame. Thus, nodes where foreground pixels were previously found are validated directly at the final level, and new nodes are created only for the pixels which were classified as background.
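A minimal sketch of the segment validation in (2) follows: up to three random pixels of the difference image are sampled, and the segment is rejected only if all of them stay below the threshold. Here I_n is assumed to be the absolute difference image |I_t - B_{t-1}| restricted to segment n, and the function name is hypothetical.

    #include <opencv2/core/core.hpp>
    #include <cstdlib>

    // Validation of one quadtree segment according to Eq. (2): the first random
    // pixel that exceeds the threshold T marks the segment as foreground; only
    // after a background hit are up to two more random pixels checked before
    // the segment is finally rejected.
    // 'diff' is the absolute difference image, (x, y, w, h) is the segment.
    bool segmentIsForeground(const cv::Mat& diff,
                             int x, int y, int w, int h, int T)
    {
        for (int sample = 0; sample < 3; ++sample) {
            const int px = x + std::rand() % w;
            const int py = y + std::rand() % h;
            if (diff.at<uchar>(py, px) > T)
                return true;      // foreground: subdivide or mark this node
        }
        return false;             // all three samples below T: reject segment
    }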
III. BACKGROUND UPDATING METHOD

As was shown in Fig. 1, updating the background model on each new image slows down the system performance. However, such frequent updating of the background model is not necessary. Therefore, we propose to update the background model once every k frames, as in (3):

    k = \mathrm{round}\left(\frac{T}{avg\_dev}\right),    (3)

where T is the threshold for the classification of foreground/background pixels and avg_dev is the mean value of the difference between the pixels of the current image and the corresponding background ones. It can be calculated as

    avg\_dev = \frac{1}{m}\sum_{i=1}^{m}\left(I_t - B_{t-1}\right),    (4)

where m is the number of detected background pixels, i.e. those for which (I_t - B_{t-1}) < T.

In this way, the smaller the difference between the current and background pixels, the less frequently the background model is updated. When the changes become significant, i.e. close to the threshold value, the background model is updated on each new image, as in the classical running average method.

In the previous section we proposed to classify not every pixel but only randomly selected ones. Therefore, the avg_dev value can be calculated based only on those pixels which were selected and classified as background during the creation of the quadtree.

Since the threshold T does not change during system operation, evaluating (3) each time is not reasonable from the computational complexity point of view. It is more efficient to generate an array of all possible values of k, depending on avg_dev, at the beginning of program execution.
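The following sketch shows one way the quantities in (3) and (4) could be computed, together with the precomputed array of update periods. For the sketch we assume that the deviation in (4) is taken as an absolute difference and that avg_dev is rounded to an integer index; neither detail is specified in the paper, and all names are hypothetical.

    #include <opencv2/core/core.hpp>
    #include <cmath>
    #include <vector>

    // Mean deviation of the background pixels, Eq. (4): the average difference
    // between the current frame and the background model over the m pixels
    // whose difference stays below the threshold T. The full image is scanned
    // here for clarity; in the described method only the pixels sampled while
    // building the quadtree would be used.
    double averageDeviation(const cv::Mat& frame, const cv::Mat& background, int T)
    {
        double sum = 0.0;
        long   m   = 0;
        for (int y = 0; y < frame.rows; ++y)
            for (int x = 0; x < frame.cols; ++x) {
                const double d = std::abs(frame.at<uchar>(y, x)
                                          - background.at<float>(y, x));
                if (d < T) { sum += d; ++m; }   // background pixel
            }
        return (m > 0) ? sum / m : 0.0;
    }

    // Update period from Eq. (3), k = round(T / avg_dev), precomputed once for
    // every integer deviation value since T is fixed during operation.
    std::vector<int> precomputePeriods(int T)
    {
        std::vector<int> k(T + 1, T);   // a deviation of 0 is capped at T here
        for (int dev = 1; dev <= T; ++dev)
            k[dev] = static_cast<int>(std::lround(double(T) / dev));
        return k;
    }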
Updating of the background model itself is handled as in (5):

    B_t = \begin{cases} I_t, & I_t \in BG \\ B_{t-1} + avg\_dev, & I_t \in FG \end{cases},    (5)

and the classification of the foreground pixels is performed according to

    I_t - B_t > T.    (6)
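Combining (5) and (6), a per-pixel pass could look like the sketch below: background pixels are replaced by the current values and foreground pixels are shifted by avg_dev. The buffer types, the 0/255 mask convention and the function name are our assumptions.

    #include <opencv2/core/core.hpp>

    // Classification (Eq. 6) and background model update (Eq. 5) in one pass:
    // a pixel is foreground when I_t - B_t exceeds the threshold T; background
    // pixels are replaced by the current value, foreground pixels are shifted
    // by avg_dev. In the described method this runs only once every k frames.
    void classifyAndUpdate(const cv::Mat& frame, cv::Mat& background,
                           cv::Mat& fgMask, int T, double avgDev)
    {
        fgMask.create(frame.size(), CV_8UC1);
        for (int y = 0; y < frame.rows; ++y)
            for (int x = 0; x < frame.cols; ++x) {
                const double d = frame.at<uchar>(y, x) - background.at<float>(y, x);
                const bool fg  = d > T;                        // Eq. (6)
                fgMask.at<uchar>(y, x) = fg ? 255 : 0;
                background.at<float>(y, x) = fg                // Eq. (5)
                    ? background.at<float>(y, x) + static_cast<float>(avgDev)
                    : static_cast<float>(frame.at<uchar>(y, x));
            }
    }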

As a result, a binary mask is generated. It still contains noise, and therefore it is required to apply such morphological operations as dilation and erosion. It is also important to define their order and size: the order affects the accuracy, while the size affects both the accuracy and the computational complexity of the noise removal stage.

Based on experiments with different combinations, we found the following effective combination: dilation by two pixels, then erosion by three pixels, followed by dilation by one pixel. In this way, the first operation removes the small holes inside the objects and extends their boundaries. The subsequent erosion removes the extra pixels and also removes isolated noise. The last step returns the objects to their initial dimensions.
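With OpenCV, the dilation-erosion-dilation sequence described above can be sketched as follows. Mapping "by N pixels" to a (2N+1)x(2N+1) rectangular structuring element is our own interpretation and may differ from the authors' exact settings.

    #include <opencv2/core/core.hpp>
    #include <opencv2/imgproc/imgproc.hpp>

    // Noise removal on the binary motion mask: dilation by two pixels,
    // erosion by three pixels, dilation by one pixel.
    void cleanMask(cv::Mat& mask)
    {
        const cv::Mat dil2 = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(5, 5));
        const cv::Mat ero3 = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(7, 7));
        const cv::Mat dil1 = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3, 3));

        cv::dilate(mask, mask, dil2);  // close small holes, extend the boundaries
        cv::erode(mask, mask, ero3);   // remove the extra pixels and isolated noise
        cv::dilate(mask, mask, dil1);  // return objects to the initial dimensions
    }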
IV. EXPERIMENTAL RESULTS

The proposed methods were implemented in Visual C++ using the OpenCV library [13]. Testing was performed on four image sequences. The first two sequences contain scenes with synthetic human models walking and have an image size of 640x480 pixels; they were obtained from [14]. The third sequence was obtained from the PETS'2009 workshop database [15] and contains scenes with many walking people and an image size of 720x576 pixels. Finally, the fourth sequence contains two moving people and has a relatively small image size of 352x288 pixels.

Table I shows the average frame rates obtained on the test sequences with the following methods: simple difference between the background and current frames without updating the background model, running average, simple difference with the quadtree [11], and the proposed method. The performance was calculated without displaying the image on the screen. The running average method combined with the quadtree was not tested because the combination is inappropriate: the running average method supposes updating all pixels of the picture, whereas the purpose of the quadtree is to keep the number of processed pixels as low as possible.

Table I
AVERAGE PROCESSING SPEED (FRAMES PER SECOND)

            Simple       Running    Quadtree [11] +     Proposed
            difference   Average    simple difference   method
Video 1     95.1         59.8       104.3               108.9
Video 2     104.4        65.6       107.8               110.6
Video 3     125.45       82.88      138.9               141.2
Video 4     461.54       303.03     478.47              462.7

The proposed method shows the same frame rate as the original quadtree, but it also updates the background model and shows better accuracy (see Fig. 4). In both pictures no morphological operation was applied and only the segmentation results are displayed.

Fig. 4. The motion segmentation results of the algorithm presented in [11] (a) and the one proposed in this paper (b).

V. CONCLUSIONS

This paper describes an improved approach to real-time motion detection for video surveillance systems based on the hierarchical data structure and a modified method of background model updating. It is up to 10% faster than simple difference and up to 45% faster than running average. In comparison to the quadtree in [11], it shows the same frame rate, but it also updates the background model and shows better accuracy.

388
Possible applications where such a system can be used include complete video surveillance systems, where motion detection is the first stage followed by more complex stages. Since the subsequent processing is usually very computationally complex, performing very fast motion detection allows increasing the performance of the whole system.

ACKNOWLEDGMENT

The above results were obtained as a part of the Ukrainian-Bulgarian joint research project "Human Biometric Identification in Video Surveillance Systems", supported by the Ministry of Education and Science of Ukraine under agreement M/205-2009.

REFERENCES

[1] I. Paliy, Y. Kurylyak, A. Sachenko, A. Chohra, K. Madani, "Face detection on grayscale and color images using combined cascade of classifiers," Journal of Computing, Vol. 8, Issue 1, pp. 61-71, 2009.
[2] S. Elhabian, K. El-Sayed, S. Ahmed, "Moving object detection in spatial domain using background removal techniques – state-of-art," Recent Patents on Computer Science, 1(1), pp. 32-54, January 2008.
[3] Neeti A. Ogale, "A survey of techniques for human detection from video," University of Maryland, Technical report, 2005.
[4] Weiming Hu, Tieniu Tan, Liang Wang, Steve Maybank, "A survey on visual surveillance of object motion and behaviors," IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, Vol. 34, No. 3, pp. 334-352, 2004.
[5] L. Wang, W. Hu, and T. Tan, "Recent developments in human motion analysis," Pattern Recognition, No. 36(3), pp. 585-601, 2003.
[6] A. M. McIvor, "Background subtraction techniques," Proc. of Image and Vision Computing, Auckland, New Zealand, 2000.
[7] I. Haritaoglu, D. Harwood, and L. S. Davis, "W4: A real-time system for detecting and tracking people," Computer Vision and Pattern Recognition, pp. 962-967, 1998.
[8] R. T. Collins et al., "A system for video surveillance and monitoring: VSAM final report," Technical Report CMU-RI-TR-00-12, Robotics Institute, Carnegie Mellon University, 2000.
[9] David J. Fleet, Yair Weiss, "Optical flow estimation," Mathematical Models in Computer Vision: The Handbook, N. Paragios, Y. Chen, and O. Faugeras (editors), Chapter 15, pp. 239-258, 2005.
[10] Nan Lu, Jihong Wang, Q.H. Wu, Li Yang, "An improved motion detection method for real-time surveillance," IAENG International Journal of Computer Science, Vol. 35, Issue 1, pp. 119-128, 2008.
[11] Johnny Park, Amy Tabb, and Avinash C. Kak, "Hierarchical data structure for real-time background subtraction," Proceedings of the IEEE International Conference on Image Processing, 2006.
[12] Yu. Kurylyak, A. Sachenko, "Detection of moving objects in video stream based on background subtraction method and hierarchical data structure," The Bulletin of Chernivtsi National University, Issue 426, pp. 135-139, 2008 (in Ukrainian).
[13] OpenCV Library: http://sourceforge.net/projects/opencv/.
[14] European Union MUSCLE Network of Excellence (FP6-507752): http://muscle.prip.tuwien.ac.at/.
[15] PETS'2009 database: http://www.cvg.rdg.ac.uk/PETS2009/.
