Fast Smoke Detection for Video Surveillance using CUDA

Alexander Filonenko, Member, IEEE, Danilo Cáceres Hernández, Member, IEEE, and Kang-Hyun Jo, Senior Member, IEEE

Manuscript received July 2, 2015; revised October 31, 2015, March 10, 2016, October 19, 2016, July 27, 2017, and April 27, 2017; accepted for publication September 6, 2017. This work was supported by the National Research Foundation of Korea (NRF) Grant funded by the Korean Government (2016R1D1A1A02937579).
Alexander Filonenko is with the Graduate School of Electrical Engineering, University of Ulsan, Ulsan, Korea (e-mail: alexander@islab.ulsan.ac.kr).
Danilo Cáceres Hernández is with the Electrical Department, Universidad Tecnológica de Panamá (UTP), Panamá, Panamá (e-mail: danilo.caceres@utp.ac.pa).
Kang-Hyun Jo is with the School of Electrical Engineering, University of Ulsan, Ulsan, Korea (e-mail: acejo@ulsan.ac.kr).


Abstract—Smoke detection is a key component of disaster and
accident detection. Despite the wide variety of smoke detection
methods and sensors that have been proposed, none has been
able to maintain a high frame rate while improving detection
performance. In this paper, a smoke detection method for
surveillance cameras is presented that relies on shape features of
smoke regions as well as color information. The method takes
advantage of the use of a stationary camera by using a
background subtraction method to detect changes in the scene.
The color of the smoke is used to assess the probability that pixels
in the scene belong to a smoke region. Due to the variable density
of the smoke, not all pixels of the actual smoke area appear in the
foreground mask. These separate pixels are united by
morphological operations and connected-component labeling
methods. The existence of a smoke region is confirmed by
analyzing the roughness of its boundary. The final step of the
algorithm is to check the density of edge pixels within a region.
Comparison of objects in the current and previous frames is
conducted to distinguish fluid smoke regions from rigid moving
objects. Some parts of the algorithm were boosted by means of
parallel processing using CUDA GPUs, thereby enabling fast
processing of both low-resolution and high-definition videos. The
algorithm was tested on multiple video sequences and
demonstrated appropriate processing time for a realistic range of
frame sizes.
Fig. 1. Smoke detection algorithm. The gray background area represents the
steps performed using CUDA.
Index Terms—Boundary roughness, color probability, CUDA, edge density, GPGPU, smoke detection.
I. INTRODUCTION

Presently, there is a noticeable demand for automatic smoke detection systems that work quickly while requiring low maintenance costs. These surveillance systems are utilized for smoke detection itself or for early warning of fires. In the latter case, flames may not appear in front of the camera during the first moments after ignition, but burning materials emit pillars of smoke that occupy larger volumes. In such cases, an accident can be detected even if the source of the fire is hidden behind another object such as a fence.

Cameras are not the only means of smoke detection. Sensor nodes, used within wired or wireless networks, are able to detect temperature changes and poisonous gas concentrations without requiring light sources. When sensor nodes are equipped with batteries and ZigBee wireless communication modules, they can be placed quite far from each other and can cover vast areas. However, this type of configuration requires regular maintenance and cannot be used to survey whole territories because of the discrete nature of the sensor nodes' spatial distribution.

Cameras require less maintenance and can survey the whole region in their field of view; however, general-purpose surveillance cameras fail to provide meaningful data at night due to poor noise performance and a lack of color information. To cope with this issue, Torabnezhad et al. proposed another way to detect smoke by utilizing infrared (IR) images, which allow smoke to be distinguished from other smoke-like regions with higher precision [1]; however, this method relies on color information and thus does not solve the problem of smoke detection at night. The high cost of IR cameras also affects the ability to build cost-effective systems based on such methods.

The problem discussed in this paper is the detection of smoke that appears in front of a stationary camera. Smoke can be the early precursor of fire, and its rapid detection may decrease the harm that a fire will cause. Because inexpensive surveillance systems tend to use low-resolution cameras, the algorithm should work fast and should be able to detect smoke from low-resolution video data. However, cheap high-definition (HD) cameras are presently emerging whose data cannot be processed rapidly by means of existing surveillance algorithms.

Fig. 2. Example images belonging to a training dataset.

One popular way to improve processing speed is to use general-purpose graphics processing unit (GPGPU) cores, one type of which is Nvidia Compute Unified Device Architecture (CUDA) cores. These cores allow parallel processing of massive amounts of data. The discrete nature of images makes it possible to process each pixel separately by means of a variety of algorithms. In any combination of sensors, the color camera remains the main component of the system.

This paper presents a hybrid approach that combines the CPU and the GPGPU in one algorithm, where the steps that benefit from parallelization are implemented using CUDA. A comparison with a CPU-only implementation of the proposed method shows that, for HD videos, the hybrid approach keeps the processing time below 200 ms, which is appropriate for surveillance systems, whereas the CPU-only implementation achieved less than two frames per second. The state-of-the-art method described in [12] was reimplemented to provide a direct comparison on the same hardware and data; the results show that the approach in [12] requires more processing time than both the hybrid and the CPU-only implementations. This manuscript is therefore focused on improving the performance of processing data from the camera. The method proposed in this work achieved a data processing speed of up to 61 frames per second (fps) for 320×240 video sequences and more than 5 fps for HD video.

The contributions of the present work are as follows:
• The proposed combination of CPU and GPGPU processing is presently the fastest computer-based implementation of smoke detection for surveillance cameras;
• The performance of the proposed CPU+GPGPU implementation degrades more slowly than that of a CPU-only implementation when the frame size is increased.

The rest of this paper is structured as follows. Section II discusses related works. Section III explains the proposed method. Section IV illustrates how some parts of the algorithm were accelerated by using CUDA. Results are presented in Section V. Finally, Section VI presents conclusions and discusses future plans.

II. RELATED WORKS

Most methods used for smoke detection from video sequences are based upon the idea that, when using fixed cameras, smoke can be distinguished from the background. The advantage of this approach is that there is no need to consider static objects that have similar color characteristics, shape, etc. For example, in one research work candidate regions were extracted by motion features, and the candidates were then analyzed in the spatial, temporal, and spatiotemporal domains [2]; in this case, the authors sacrificed detection rate (83.5%) to decrease the false positive rate (0.1%) significantly. The reaction time of the system was as slow as 1.34 s. The approaches of Toreyin et al. [3] and Yuan [4] also exceeded a 1 s response time. As noted above, Torabnezhad et al. proposed another method that used image fusion to detect smoke regions; specifically, visual and thermal data were combined to improve the rate of fire detection [1]. The smoke was not visible to the long-wavelength infrared (LWIR) camera, and this fact was used to distinguish smoke from smoke-like objects. A potential smoke mask was constructed based on the visual and LWIR images. The main demerit of the approach proposed by Torabnezhad et al. is the expensive hardware used and the limited distance of the smoke detection. Analysis of the way that objects disappear from IR images once they are covered by smoke could lead to successful smoke detection in low-light conditions; however, this technique has not been implemented yet. The use of a stationary camera made it possible to detect smoke by means of a background subtraction technique using the data acquired by a single camera sensor. Filonenko et al. [5] could not achieve fast performance using a single CPU. Chen et al. [6] used the wavelet transform to analyze the frequency characteristics of smoke. Maruta et al. [7] used the AdaBoost algorithm together with an improved variant of local binary patterns; according to the results they presented, smoke could be detected, but they did not mention how fast the algorithm is. Background subtraction was the essential part of the methods proposed in two other papers [8], [9]. It is also possible to produce a specialized device for this specific task. Li Jinghong et al. designed a field-programmable gate array (FPGA) solution to decrease processing time [10]. A recent FPGA design [11] uses a fuzzy neural network to decide whether there is a fire accident based on data from a smoke sensor. Unfortunately, using an FPGA limits the diversity of the possible applications of the hardware. Modern video surveillance systems are starting to use HD cameras, for which further optimizations in the implementation need to be carried out.

In a recent state-of-the-art paper [12], the authors used dynamic analysis of candidate smoke, achieving fast processing performance for 320×240 resolution video sequences while causing no false alarms in smoke detection.
In this approach, input frames should be analyzed in the time domain so that real detection events can be confirmed by observing that the number of times smoke is detected remains high for a few seconds. As a detector, a modified version of AdaBoost was applied in which the polarity of the weak classifier was changed by adding one more threshold. The algorithm presented in [12] cannot detect all parts of smoke in the scene due to high variations in smoke characteristics such as color and density. However, the fact that smoke appeared in a video sequence was not missed. Detection occurred with some delay owing to the use of dynamic analysis of candidate smoke. Smoke can be white or dark depending on the source of the fire; the double-thresholded AdaBoost detector can successfully detect both smoke types. Another merit of the approach reported in [12] is that most of the parameters have been chosen and adjusted automatically.

Fig. 3. Probability density function distributions for the red, green, and blue channels of the training dataset.

Fig. 4. Probability density function distribution for the saturation channel of the training dataset (scaled to [0, 255]).
III. SMOKE DETECTION ALGORITHM
This section discusses the method of smoke detection using
a stationary camera. The main steps of the method are
illustrated in Fig. 1. The steps represented by rectangles placed
on the gray background can be processed in parallel. The
parallel implementation using CUDA will be discussed in
Section IV. The camera is mounted in a stationary position;
therefore, it is possible to separate the background and
foreground. Smoke will appear in the foreground owing to its
dynamic nature.
A. Background Subtraction

There are many possible ways to separate the foreground and the background. At first, we used the modern approach named the Self-Balanced Sensitivity Segmenter (SuBSENSE) [13]. Thanks to the update strategy and neighbor-spread rules that are used to update the background model, this method is resistant to intermittent object motion and camera shaking. Zoran Zivkovic et al. guaranteed "90 frames per second on the entire 2012 CDnet dataset", which is faster than real-time processing performance, and also provided source code. However, in experiments in the present work it took 280.683 ms to process a single HD-resolution frame using an Intel Core i7 870 CPU and applying all optimizations of Microsoft Visual Studio 2013. Thus, for the background subtraction, a fast yet reliable approach [14] was instead applied in this research. A parallel implementation of this approach exists in the standard OpenCV libraries, which makes it easy to reproduce the method in different configurations. The background model needs some time to be constructed and adjusted. Thus, the first frames of a video sequence may give incorrect segmentation results.
color space was also calculated (Fig. 4). Mean and standard
B. Foreground Mask Refining deviation values were obtained for each PDF. For example, µR
As the result of the background subtraction, a foreground and σR are the mean and standard deviation values for the red
mask was obtained as shown in Figs. 5b and 6b. Many parts channel, respectively.
belonging to the same smoke cloud were separated in the The probability that the current pixel belongs to a smoke
mask. To unite them into a single blob, two morphology region according to the ith channel distribution is represented
operations should be applied to the foreground mask: opening by (1):
and closing. As a result, most of the separate parts of the
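A minimal sketch of this refinement step on the GPU mask follows; the 5×5 rectangular structuring element is an illustrative choice, not a parameter reported in the paper.

// A minimal sketch: opening followed by closing on the GPU foreground mask
// (OpenCV 3.x). The 5x5 rectangular structuring element is illustrative.
#include <opencv2/imgproc.hpp>
#include <opencv2/cudafilters.hpp>

cv::cuda::GpuMat refineMask(const cv::cuda::GpuMat& d_fgmask) {
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(5, 5));
    cv::Ptr<cv::cuda::Filter> open =
        cv::cuda::createMorphologyFilter(cv::MORPH_OPEN, d_fgmask.type(), kernel);
    cv::Ptr<cv::cuda::Filter> close =
        cv::cuda::createMorphologyFilter(cv::MORPH_CLOSE, d_fgmask.type(), kernel);
    cv::cuda::GpuMat d_tmp, d_refined;
    open->apply(d_fgmask, d_tmp);    // opening removes isolated noise pixels
    close->apply(d_tmp, d_refined);  // closing unites nearby fragments and fills holes
    return d_refined;
}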


Fig. 5. Smoke detection example for 'Dry_leaf_smoke_02.avi' (V2): (a) input image; (b) background subtraction; (c) morphology transformations; (d) color probability of the red channel; (e) binarized intersection of probabilities; (f) edges; (g) result image.

Fig. 6. Smoke detection example for 'sWasteBasket.avi' (V6): (a) input image; (b) background subtraction; (c) morphology transformations; (d) color probability of the red channel; (e) binarized intersection of probabilities; (f) edges; (g) result image.

C. Color Probability

Smoke and non-smoke regions can be distinguished by the color characteristics of moving objects. A smoke training dataset was created by using actual smoke regions to depict smoke color characteristics; some samples of such a dataset are shown in Fig. 2. For all the pixels of the dataset, probability density functions (PDF) of normal distributions were computed for the red (R), green (G), and blue (B) channels of the RGB color space (Fig. 3). It was noted that in most scenarios the smoke is poorly saturated, so the PDF for the saturation (S) channel of the HSV color space was also calculated (Fig. 4). Mean and standard deviation values were obtained for each PDF; for example, $\mu_R$ and $\sigma_R$ are the mean and standard deviation values for the red channel, respectively.

The probability that the current pixel belongs to a smoke region according to the $i$th channel distribution is given by (1):

$$P_i(x, y) = \exp\left( -\frac{(I_i(x, y) - \mu_i)^2}{2\sigma_i^2} \right), \quad (1)$$

where $x$ and $y$ are the coordinates of the pixel within the image, $i$ is one of the four channels of the current pixel of the original image $I(x, y)$, and $\mu_i$ and $\sigma_i$ are the mean and standard deviation of the current channel of the training dataset, respectively.

To combine the color probabilities of the four channels, these channels should be unified. Since all four channels (R, G, B, and S) cannot be considered independently, the intersection of the probabilities of the four "events" is calculated according to (2), as shown in Figs. 5e and 6e:

$$P_I(x, y) = \prod_{i \in \{R, G, B, S\}} P_i(x, y). \quad (2)$$

For further analysis, the foreground pixels, refined by the morphology operations, should be united into blobs by means of the connected-component labeling (CCL) method. To calculate the number of pixels belonging to each blob with high color probability, it is required to binarize $P_I$ as follows:

$$B(x, y) = \begin{cases} 1, & \text{if } P_I(x, y) > \lambda \\ 0, & \text{otherwise} \end{cases} \quad (3)$$

After the binarized color probability is acquired, its density is calculated for each blob: for each blob, the density of pixels with high color probability is obtained, and areas containing less than 25% of such pixels are deleted and not considered in further steps.

Artificial objects (e.g., vehicles) may appear that have color characteristics similar to smoke. Usually, their shape will be similar to a convex hull, whereas smoke regions will be of random shape. A feature called boundary roughness can distinguish real smoke regions from such artificial objects [15]:

$$R_B = \frac{P_b}{P_{CH_b}}, \quad (4)$$

where $P_b$ is the perimeter of the blob and $P_{CH_b}$ is the perimeter of the convex hull of the same blob. The perimeter of a blob is the sum of the boundary pixels that are connected vertically and horizontally plus $\sqrt{2}$ times the number of boundary pixels that are connected diagonally. Blobs with low roughness are removed by the following rule:

$$B(m, n) = 0, \quad \text{if } R_B < \beta, \quad (5)$$

where $\beta$ is a threshold and $m$ and $n$ are the coordinates of the pixels of the currently considered blob.

Some moving objects with random shapes may have colors close to those in the training dataset. Smoke regions usually contain many edges. The edge density is another means whereby smoke and rigid objects can be distinguished:

$$D_E = \frac{N_e}{N_b}, \quad (6)$$

where $N_e$ is the number of edge pixels within the blob (Figs. 5f and 6f) and $N_b$ is the total number of pixels in the blob. The following rule is then applied:

$$B(m, n) = 0, \quad \text{if } D_E < \xi, \quad (7)$$

where $B(m, n)$ is the image obtained by applying (5), $(m, n)$ are the coordinates of the pixels of the currently considered blob, and $\xi$ is the threshold value.

The last step of the algorithm takes advantage of the sequential nature of video data. Blobs that remain in the current frame are compared to their closest neighbors in the previous frame. If their areas differ, then the blob is considered to represent a real smoke region. Otherwise, it is taken to be another static object that should not be considered further. Result images of this process are shown in Figs. 5g and 6g.
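The per-blob tests (4)-(7) can be sketched as follows; here the contour-based cv::arcLength perimeter is a stand-in for the pixel-counting perimeter defined above, and the helper function is illustrative rather than the authors' code.

// A minimal sketch of boundary roughness (4)-(5) and edge density (6)-(7) for
// one blob. cv::arcLength stands in for the pixel-counting perimeter definition.
#include <opencv2/imgproc.hpp>
#include <vector>

bool keepBlob(const std::vector<cv::Point>& contour, const cv::Mat& edgeMask,
              const cv::Mat& blobMask, double beta, double xi) {
    // Boundary roughness, eq. (4): blob perimeter over convex-hull perimeter.
    std::vector<cv::Point> hull;
    cv::convexHull(contour, hull);
    double rb = cv::arcLength(contour, true) / cv::arcLength(hull, true);
    if (rb < beta) return false;                        // eq. (5): too smooth, not smoke

    // Edge density, eq. (6): edge pixels inside the blob over blob area.
    double ne = cv::countNonZero(edgeMask & blobMask);  // Canny edges restricted to the blob
    double nb = cv::countNonZero(blobMask);
    double de = (nb > 0.0) ? ne / nb : 0.0;
    return de >= xi;                                    // eq. (7): keep edge-rich blobs
}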
IV. PARALLEL IMPLEMENTATION

Fig. 7. CUDA structure: (a) hardware; (b) CPU and GPU cooperation.

Fig. 8. The processing time of 'sWasteBasket.avi' (V6) at different resolutions when all processing steps are performed. "GPU pure" refers to the processing of only the GPU steps indicated in Fig. 1. Results for [12] were obtained by our implementation.

In general, a GPGPU consists of many streaming processors (SP in Fig. 7a) united in blocks that use fast shared memory. In modern GPUs, many CUDA cores (streaming processors) are available; for example, the Nvidia GeForce GTX 980M can use 1536 cores. The blocks form streaming multiprocessors that read and write data to the global memory, which is available to any block of the GPU. The global memory has an enormous amount of space compared to the shared memory, but read/write operations to and from global memory are expensive in terms of time. It is good practice to perform most operations in the shared memory and then to store the final result in the global memory. The CPU and the GPU cannot share data directly, and the memory of the CPU (host) is not available to the GPU. Data should first be sent to the GPU global memory to be processed (step 1 in Fig. 7b). After this step, data can be read by the streaming processors and stored in the shared memory. After the calculations are done (step 4 in Fig. 7b), the result should be stored again in the global memory (step 3 in Fig. 7b) and then sent back to the host (step 2 in Fig. 7b). The program that runs on the GPU is called the kernel. In a GPGPU scheme, the same kernel processes different data in parallel.
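The data flow of Fig. 7b corresponds to the standard CUDA transfer pattern sketched below (buffer names and the frame layout are illustrative):

// A minimal sketch of the host/GPU data flow in Fig. 7b: upload the input to
// global memory, run a kernel, download the result. Names are illustrative.
#include <cuda_runtime.h>

void processFrame(const unsigned char* h_in, unsigned char* h_out, int n) {
    unsigned char *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n);
    cudaMalloc(&d_out, n);
    cudaMemcpy(d_in, h_in, n, cudaMemcpyHostToDevice);   // step 1: host -> global memory
    // kernel<<<blocks, threads>>>(d_in, d_out, n);      // steps 3-4: compute and store result
    cudaMemcpy(h_out, d_out, n, cudaMemcpyDeviceToHost); // step 2: result back to the host
    cudaFree(d_in);
    cudaFree(d_out);
}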
In the OpenCV 3.0 library, many algorithms are already implemented on GPUs. The background subtraction implementation and the Canny edge detector used in the current work can be retrieved directly from OpenCV. The same is true for the morphology transformations.
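For illustration, the GPU Canny detector can be retrieved as sketched below; the 50/150 hysteresis thresholds are illustrative assumptions, not values from the paper.

// A minimal sketch of GPU Canny edge detection with OpenCV 3.x.
// The 50/150 hysteresis thresholds are illustrative.
#include <opencv2/cudaimgproc.hpp>

cv::cuda::GpuMat detectEdges(const cv::cuda::GpuMat& d_gray) {
    cv::Ptr<cv::cuda::CannyEdgeDetector> canny =
        cv::cuda::createCannyEdgeDetector(50.0, 150.0);
    cv::cuda::GpuMat d_edges;
    canny->detect(d_gray, d_edges);   // edge map later used for the edge density test (6)
    return d_edges;
}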

The computation of color probability, however, is a specific task that should be implemented using CUDA C++ directly. According to the first data column of Table III, representing the case in which only the CPU was utilized, the calculation of color probability was the most time-consuming step. In preparing the color probability binary image, every pixel can be processed independently. Memory transitions are expensive; therefore, all four color probability values for a pixel were calculated in a single thread. Furthermore, the intersection of the probabilities of the four "events" and the binarization were also included in the thread mentioned above. In this case, only five global-memory read operations (the four channels and the training data) and one write operation were performed for each frame. The CUDA kernel for the calculation of color probability is summarized in Algorithm 1.

Data: red (R), green (G), blue (B), saturation (S), training data
Result: binarized intersection
Start a thread for each pixel position;
Read R, G, B, S, and the training data;
Calculate the color probabilities of (R, G, B, S);
Calculate the intersection of the probabilities;
if intersection > threshold then
    result = 1;
else
    result = 0;
end
Write result to global memory.

Algorithm 1: CUDA kernel for color probability calculation

TABLE III
AVERAGE PROCESSING TIME OF A SINGLE FRAME OF V2 'DRY_LEAF_SMOKE_02.AVI' FOR EACH PROCESSING STEP (MS)

Process | 320×240 CPU | 320×240 Hybrid
Background subtraction | 10.2086 | 1.7377
Morphology transformations | 0.8785 | 4.9470
Color probability calculation | 15.7245 | 1.3468
Labeling | 1.1860 | 1.7617
Edge detection | 1.9202 | 2.0147
Color probability filtering | 0.9113 | 0.9113
Boundary roughness | 2.7419 | 2.7419
Edge density calculation | 1.0604 | 1.0604
Filtering by area | 0.9373 | 0.9373
Total | 35.5688 | 17.4587
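The kernel below is a minimal CUDA C++ sketch of Algorithm 1 and (1)-(3), since the actual implementation is not published with the paper; the planar channel layout, the names, and the launch configuration are illustrative assumptions.

// A minimal sketch of Algorithm 1: per-pixel color probability and binarized
// intersection, eqs. (1)-(3). The planar channel layout is an assumption.
#include <cuda_runtime.h>
#include <math.h>

struct ChannelStats { float mean; float sigma; };        // per-channel training data

__device__ float channelProb(float v, ChannelStats s) {
    float d = v - s.mean;                                // eq. (1)
    return expf(-(d * d) / (2.0f * s.sigma * s.sigma));
}

__global__ void colorProbabilityKernel(const unsigned char* r, const unsigned char* g,
                                       const unsigned char* b, const unsigned char* s,
                                       unsigned char* out, int n,
                                       ChannelStats cr, ChannelStats cg,
                                       ChannelStats cb, ChannelStats cs, float lambda) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;       // one thread per pixel position
    if (i >= n) return;
    float p = channelProb(r[i], cr) * channelProb(g[i], cg)   // eq. (2): intersection
            * channelProb(b[i], cb) * channelProb(s[i], cs);  // of the four "events"
    out[i] = (p > lambda) ? 1 : 0;                       // eq. (3): binarization
}

// Illustrative launch for an n-pixel frame already resident in global memory:
// colorProbabilityKernel<<<(n + 255) / 256, 256>>>(d_r, d_g, d_b, d_s, d_out, n,
//                                                  cr, cg, cb, cs, 0.5f);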
The processing times for connected-component labeling on the CPU and the GPU are compared in Table I.

TABLE I
COMPARISON OF PROCESSING TIME FOR CONNECTED-COMPONENT LABELING ON CPU AND GPU FOR VIDEO V6 'SWASTEBASKET.AVI' (MS)

Resolution | CPU | GPU
320 × 240 | 5.1715 | 0.9839
1280 × 720 | 54.1079 | 5.8168
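For reference, a CPU-side sketch of the labeling step using OpenCV's connected-components API is given below; whether the paper's GPU labeling matches this interface is not stated, so this helper is purely illustrative.

// A minimal CPU-side sketch of connected-component labeling of the refined mask.
// The GPU variant used for Table I is not shown; this helper is illustrative.
#include <opencv2/imgproc.hpp>

int labelBlobs(const cv::Mat& refinedMask, cv::Mat& labels,
               cv::Mat& stats, cv::Mat& centroids) {
    // 8-connectivity so that diagonally touching smoke fragments share one label.
    int nLabels = cv::connectedComponentsWithStats(refinedMask, labels,
                                                   stats, centroids, 8, CV_32S);
    // stats.at<int>(i, cv::CC_STAT_AREA) gives the pixel count of blob i, which
    // can serve the 25% color-probability density test of Section III.
    return nLabels;
}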
V. EXPERIMENTAL RESULTS

The proposed method was evaluated using a computer with the following hardware: an Intel Core i7-4720HQ CPU, an Nvidia GeForce GTX 980M GPU, and 16 GB of DDR3 RAM. All tests were performed in Windows 10 x64. The software was written in Microsoft Visual Studio Professional 2013; Visual C++ was used for the CPU implementation, and CUDA C++ was utilized for the GPU implementation. As noted in Section IV, the OpenCV 3.0 library was used for basic operations and for the parallel implementation of some steps. Nine publicly accessible video sequences were used for the tests [17], [18]: Cotton_rope_smoke_04.avi, Dry_leaf_smoke_02.avi, sBtFence2.avi, sMoky.avi, sParkingLot.avi, sWasteBasket.avi, sWindow.avi, CarLights1.avi, and Traffic_1000.avi; for simplicity, these video files are respectively termed V1-V9 hereinafter. Videos V8 and V9 do not contain smoke; thus, most frames of these videos failed the color probability test and were skipped, which led to the lowest processing times among all the videos in the dataset (Table IV).

TABLE IV
AVERAGE PROCESSING TIME OF A SINGLE FRAME OF EACH VIDEO ON CPU ONLY AND IN HYBRID CPU+GPU MODE AT 320 × 240 RESOLUTION (MS)

Video | CPU | Hybrid CPU+GPU | [12]
V1 | 27.6085 | 16.3043 | 73.2121
V2 | 35.5688 | 17.4587 | 62.2525
V3 | 53.1262 | 21.3152 | 63.3366
V4 | 73.1279 | 27.6069 | 67.7624
V5 | 30.9237 | 18.0923 | 69.4455
V6 | 31.5653 | 17.3061 | 67.6260
V7 | 56.7668 | 22.0233 | 66.5649
V8 | 5.9428 | 2.4286 | 71.7168
V9 | 12.6175 | 7.1754 | 71.6459

Algorithm parameters were tuned heuristically once and were then kept the same for all tests. The tuned parameters were as follows: color probability binarization threshold λ = 0.5; edge density threshold ξ = 0.03; boundary roughness threshold β = 1.3. Processing time was evaluated by retrieving the current system time for each step of the algorithm by means of the std::chrono::system_clock::now() method. The inclusion of 1000 training smoke regions was enough for the mean and standard deviation to reach a steady state for each color channel.
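A minimal sketch of this per-step timing scheme (the measured step and the variable names are illustrative):

// A minimal sketch of the per-step timing scheme based on std::chrono.
#include <chrono>
#include <iostream>

int main() {
    auto t0 = std::chrono::system_clock::now();
    // ... one processing step of the pipeline, e.g. color probability ...
    auto t1 = std::chrono::system_clock::now();
    double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
    std::cout << "step time: " << ms << " ms\n";  // logged per step and averaged over frames
    return 0;
}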


Fig. 9. Results of detection of the proposed approach (second row) and our implementation of [12] (third row) on videos V1 to V7.

Fig. 10. Smoke detection result at night.

Fig. 11. Processing time using the hybrid CPU+GPU method for each frame of video 'sWasteBasket.avi' at resolution 1920×1080.

The first 150 frames of each video sequence were used to build the background model, which also allowed these frames to be utilized as color probability training data in addition to the smoke regions retrieved from photos on the Internet. This approach allowed λ to be increased from 0.3 to 0.5, thereby decreasing the possible number of false positives.

All sequences were recorded at low resolution. To test the processing time for HD videos, video V6 was resized to 1280×720 and 1920×1080 resolutions.

A variety of video files were tested. Poor recording quality brought distortion and flickering that decreased the performance of the background subtraction. Table II summarizes the detection performance of the proposed method and of our reimplementation of [12]. The accuracy of the algorithm reached 90% on three of the videos; the other four videos, which are more challenging in terms of image quality, the size of the smoke regions, and noise, showed 80% accuracy. The better the recording quality, the higher the detection rate will be. Thus, the detection performance should be further improved, but the main goal of this manuscript is to find a way to process data quickly.

TABLE II
DETECTION PERFORMANCE

Video | P | R | A | F
V1 | 1.0000 / 0.9748 | 0.7997 / 0.9910 | 0.7997 / 0.9662 | 0.8887 / 0.9828
V2 | 0.9970 / 1.000 | 0.9476 / 0.9322 | 0.9449 / 0.9335 | 0.9717 / 0.9649
V3 | 0.9370 / 0.2832 | 0.9640 / 0.3200 | 0.9053 / 0.1768 | 0.9503 / 0.3005
V4 | 0.9915 / 0.8636 | 0.7760 / 0.5767 | 0.7709 / 0.5285 | 0.8706 / 0.6915
V5 | 0.9769 / 0.0000 | 0.8033 / 0.0000 | 0.7883 / 0.0000 | 0.8816 / -
V6 | 0.9749 / 0.9640 | 0.9323 / 0.2799 | 0.9151 / 0.3080 | 0.9531 / 0.4338
V7 | 0.8532 / 0.0000 | 1.000 / 0.0000 | 0.8532 / 0.0465 | 0.9208 / -

P is precision; R is recall; A is accuracy; F is F-measure. The first value in each cell is the result of the proposed method; the second value is the result of our implementation of [12].

As shown in Table III, the method presented in this paper was able to achieve a processing speed of about 57 fps for the 320×240 video sequence V2 when utilizing the combination of CPU and GPU. The term "hybrid" in Table III means the utilization of the GPU for some of the image processing steps according to Fig. 1.

Unlike the results in [5], where the calculation of the color probabilities for the same video file took 40 ms, the proposed GPU implementation avoided this bottleneck. The GPGPU cores were clocked at lower frequencies than the CPU cores, but the speed of the color probability computation was still improved because each pixel can be processed independently and in parallel. The proposed approach was also evaluated against the most recent one [12], whose report unfortunately did not clearly state the detection performance. Yuan et al. achieved 25 frames per second for 320×240 video sequences [18], less than half the speed achieved by the presently proposed approach, but different hardware and software setups were used in their experiments. The method from [12] was therefore reimplemented for this work so that results could be compared with the same hardware and software setup mentioned above. In the present work, memory transitions were taken into account, because the delays incurred by transferring data from the host to the device (GPGPU) affect the performance. The processing time for HD video can be considered fast enough for surveillance tasks (Fig. 8). The performance of 5.6 fps for video of 1280×720 resolution should be enough to track the behavior of smoke because the smoke fluid moves slowly.


With the larger frame size also comes the possibility of detecting smaller smoke regions. From Fig. 8, it is also clear that increasing the frame resolution affects the processing time of the CPU portion more than that of the GPGPU portion. For small image sizes, the difference in processing time between the CPU and GPU implementations is not great. When the system needs to process higher resolutions, serial processing reveals its weakness. From Table I, it is clear that, for the HD frame size, connected-component labeling alone on the CPU occupies the system for a period similar to that of all the steps of the proposed method performed using the GPGPU. It is difficult to directly use the results reported in [12] for comparison because Yuan et al. evaluated whether any smoke appeared in an image rather than checking whether all smoke blobs were detected correctly. The training dataset consisting of sets 1 to 4 mentioned on page 855 of [12] and presented in [18] was used to train AdaBoost. While mostly the whole smoke region was detected according to the results published in [12], in the results of our implementation many successful detections covered only a small part of the smoke cloud, as shown in Fig. 9. Unlike the proposed approach, the patch-based approach by Yuan et al. does not reveal the actual shape of the smoke region.

By checking the data in the training dataset [18], we found samples cropped from V2, whereas there was not a single appearance of any part of V6. The results of the reimplementation of [12] in Table II for videos V2 and V6 illustrate that AdaBoost should be trained for all specific scenarios that may appear. It can be concluded that the method from [12] works well for the scenarios for which it was specifically trained, while its performance drops considerably in new scenarios. The detection results of the proposed approach (Fig. 9, second row) demonstrate that, although the whole volume of smoke cannot always be detected, the essential part of it is recognized.

Figure 11 shows the processing time of each frame for the entire video V6. Hardware initialization at the first frame took some time. The low processing time between frames 1 and 150 corresponds to the background subtraction training stage. The processing time is not stable because it depends on the number and size of objects that look and behave like smoke. The current version of the algorithm cannot efficiently deal with night-time smoke detection. For example, when analyzing a video of night-time smoke [19], the background subtraction algorithm failed to successfully separate the real smoke cloud from noise (Fig. 10) because the system was trained to recognize the smoke color characteristics from daytime images. In V5, the smoke is sparse and located far from the camera, and compression artifacts introduce much noise into the frame. Some parts of the smoke region appeared in the foreground mask, but they were removed by the morphology operations. If a more stable background subtraction method were utilized, the detection results should be better. Table IV lists the average processing time for each smoke video at the original resolution of 320×240. The processing times of the method from [12] remain similar for all video files and depend mostly on the image resolution rather than on the complexity of the scene. The values for V6 are lower than those presented in Fig. 8 because not all processing steps were performed for each frame: some parts of the algorithm were skipped if no blobs were left to be processed after the intermediate steps. This behavior can be observed in Fig. 11, where the processing time for frame 151 is three times higher than the time consumed for frame 860.

VI. CONCLUSION

This paper presents a smoke detection algorithm for video surveillance. A single CPU is not able to rapidly process HD video sequences; the most time-consuming parts should be processed by specialized devices such as FPGAs or by a GPGPU. This work was based upon the use of a GPGPU. The implementation of the proposed method achieved a time performance for HD-resolution video that is appropriate for video surveillance tasks. After tests were conducted on multiple datasets, it became clear that the method is sensitive to noise in the video; the best environmental condition for the proposed algorithm is indoors with artificial light. A comparison of processing time with a recent paper [12] has shown that the proposed approach outperforms the state-of-the-art work by processing frames four times faster when HD resolution is used. The algorithm in [12] is even more sensitive to low-quality video input, and it could not acquire true positive results on the V5 and V7 videos.

In future work, the most recent background subtraction method will be implemented in parallel to achieve noise-resistant detection. Boundary roughness and edge density can be calculated in parallel CPU threads. An interesting approach combining threading building blocks (TBB) and CUDA will be applied in the way explained in [20]. Intelligent classifiers, e.g., support vector machines and AdaBoost, will replace the simple step-by-step smoke detection if it proves possible to improve accuracy while preserving fast processing. We will also look for ways to adjust the parameters for color probability using automatic methods, e.g., fuzzy logic.

REFERENCES

[1] M. Torabnezhad, A. Aghagolzadeh, and H. Seyedarabi, "Visible and IR image fusion algorithm for short range smoke detection," in Proc. ICRoM, Tehran, 2013.
[2] C.-Y. Lee, C.-T. Lin, C.-T. Hong, and M.-T. Su, "Smoke detection using spatial and temporal analyses," International Journal of Innovative Computing, vol. 8, no. 6, pp. 1-23, Jun. 2012.
[3] B. U. Toreyin, Y. Dedeoglu, and A. E. Cetin, "Contour based smoke detection in video using wavelets," in Proc. EUSIPCO, Florence, 2006.
[4] F. Yuan, "A fast accumulative motion orientation model based on integral image for video smoke detection," Pattern Recognition Letters, vol. 29, no. 7, May 2008.
[5] A. Filonenko, D. C. Hernández, and K.-H. Jo, "Smoke detection for static cameras," in Proc. FCV, Mokpo, 2015, pp. 1-5.
[6] J. Chen, Y. Wang, Y. Tian, and T. Huang, "Wavelet based smoke detection method with RGB contrast-image and shape constrain," in Proc. VCIP, Kuching, 2013.
[7] H. Maruta, Y. Iida, and F. Kurokawa, "Anisotropic LBP descriptors for robust smoke detection," in Proc. IECON, Vienna, 2013.
[8] Y. Zhao, W. Lu, Y. Zheng, and J. Wang, "An early smoke detection system based on increment of optical flow residual," in Proc. ICMLC, Xian, 2012.

[9] H. Tian, W. Li, L. Wang, and P. Ogunbona, "A novel video-based smoke detection method using image separation," in Proc. ICME, Melbourne, 2012.
[10] L. Jinghong, Z. Xiaohui, and W. Lu, "The design and implementation of fire smoke detection system based on FPGA," in Proc. CCDC, Taiyuan, 2012.
[11] G. P. Rashmi and L. Nirmala, "FPGA based FNN for accidental fire alarming system in a smart room," International Journal of Advanced Research in Computer and Communication Engineering, vol. 3, no. 6, pp. 6902-6906, Jun. 2014.
[12] F. Yuan, Z. Fang, S. Wu, Y. Yang, and Y. Fang, "Real-time image smoke detection using staircase searching-based dual threshold AdaBoost and dynamic analysis," IET Image Processing, vol. 9, no. 10, pp. 849-856, Sep. 2015.
[13] Z. Zivkovic and F. v. d. Heijden, "SuBSENSE: a universal change detection method with local adaptive sensitivity," IEEE Transactions on Image Processing, vol. 24, no. 1, pp. 359-373, Jan. 2015.
[14] A. B. Godbehere, A. Matsukawa, and K. Goldberg, "Visual tracking of human visitors under variable-lighting conditions for a responsive audio art installation," in Proc. American Control Conference, Montreal, Jun. 2012.
[15] P. Vinicius, K. Borges, and E. Izquierdo, "A probabilistic approach for vision-based fire detection in videos," IEEE Transactions on Circuits and Systems for Video Technology, vol. 20, no. 5, pp. 721-731, Mar. 2010.
[16] O. Kalentev, A. Rai, S. Kemnitz, and R. Schneider, "Connected component labeling on a 2D grid using CUDA," Journal of Parallel and Distributed Computing, vol. 71, no. 4, pp. 615-620, Apr. 2011.
[17] Sample Fire and Smoke Video Clips, http://signal.ee.bilkent.edu.tr/VisiFire/Demo/SampleClips.html
[18] Feiniu Yuan's Video Smoke Detection, http://staff.ustc.edu.cn/~yfn/vsd.html
[19] Factory Smoke at Night, https://youtu.be/UXbClEYSxsc
[20] F. Wang, X. Jiang, and X. P. Hu, "A TBB-CUDA implementation for background removal in a video-based fire detection system," Mathematical Problems in Engineering, vol. 2014, pp. 1-6, Mar. 2014.

Alexander Filonenko (S'14–M'17) received the Specialist degree from Moscow State Technical University "MAMI", Moscow, Russian Federation, in 2011. He is currently working toward the Ph.D. degree in electrical engineering at the Graduate School of Electrical Engineering, University of Ulsan, Ulsan, Korea. His research interests include computer vision, robotics, sensor networks, and parallel computing.

Danilo Cáceres Hernández (M'13) received the Bachelor's degree in Electrical and Electronic Engineering from the UTP in 2004. He received the Master of Science degree in Electrical Engineering (2011) and the Ph.D. degree in Electrical Engineering (2017) from the University of Ulsan, Ulsan, South Korea. He is currently a full-time professor at the Electrical Department, Universidad Tecnológica de Panamá, Panamá.

Kang-Hyun Jo (M'96–SM'16) received the Ph.D. degree in computer controlled machinery from Osaka University, Japan, in 1997. After a year's experience at ETRI as a postdoctoral research fellow, he joined the School of Electrical Engineering, University of Ulsan, Ulsan, Korea. He has served as a director or an AdCom member of the Institute of Control, Robotics and Systems and The Society of Instrument and Control Engineers, and as Chair of the IEEE IES Technical Committee on Human Factors. He has also been involved in organizing many international conferences, such as the International Workshop on Frontiers of Computer Vision, the International Conference on Intelligent Computation, the International Conference on Industrial Technology, the International Conference on Human System Interactions, and the Annual Conference of the IEEE Industrial Electronics Society. Currently, he serves on the Administrative Committee of the IEEE IES, and he is an Editorial Board Member for international journals such as the International Journal of Control, Automation, and Systems and the Transactions on Computational Collective Intelligence. His research interests include computer vision, robotics, autonomous vehicles, and ambient intelligence.
