1 Introduction
In recent years, wide attention has been devoted to the prevention of fires,
which can generate smoke pollution, release greenhouse gases, and unintentionally
degrade ecosystems. Prompt detection followed by immediate intervention is very
important in order to save the environment or, at least, to reduce the damage
caused by the fire.
A solution to this problem can be found by analyzing visual data acquired by
surveillance cameras, and in recent years several approaches have been proposed
[3][11]. For instance, in [2] a color based approach has been used: fire pixels are
recognized by an advanced background subtraction technique and a statistical
RGB color model. In [10] this strategy is improved by a multi resolution two-
dimensional wavelet analysis, which evaluates energy variation to detect the
motion of flames, and a disorder feature to decrease the number of false positive
events. The wavelet transform has also been used in [13] for detecting the flame
flicker. However, the main limitation of this kind of approach is related to the
frame rate: to evaluate the flicker, the acquisition device should work at no less
than 20 fps, and the algorithm detecting events online must then work at the
same frame rate. Furthermore, a common limitation lies in the fact that the RGB
color space makes the proposed methods sensitive to changes in brightness, which
can cause a high number of false positives due to the presence of shadows or of
other red-colored objects.
2 Rosario Di Lascio, Antonio Greco, Alessia Saggese, Mario Vento
In [14] both fire and smoke are detected by color evaluation: in particular,
the HSI and RGB color spaces have been used for detecting fire and smoke,
respectively. A similar approach has been used in [9], where flicker detection is
performed by using a cumulative time derivative matrix of luminance, and fire
color detection through RGB and HSV thresholding. In [1] the limitation of RGB
based approaches is overcome by using a YUV statistical color model to separate
the luminance from the chrominance more effectively than RGB, thus reducing
the number of false positives detected by the system (from 66% [2] to 31%).
Despite the promising accuracy of state of the art approaches, two main
limitations can be highlighted: on one side, the number of false positives is still
too high to use such methods in real applications [1][2][13]. On the other side,
the reduction in the number of false positives is often paid for in terms of
computational cost, making their usage critical on embedded platforms or on
general purpose systems combined with other video analysis applications [10][14].
In order to face the above mentioned problems, we propose a novel method
able to properly combine two different types of information, related to color
and motion respectively. The color decision is evaluated in the YUV space:
although providing high accuracy, the color evaluation alone is not robust with
respect to other red objects moving in the scene. The motion decision, on the
other hand, is based on a SIFT tracker: the rationale is that a set of keypoints
on a moving object (such as a person or a vehicle) follows the same direction,
while in a fire their movement is much more disordered. The results are finally
combined by a multi expert system: the main advantage of this choice lies in
the fact that the two decision systems (color based and motion based) consider
different but complementary aspects of the same decision problem, and their
combination provides better performance than either single system.
2 Proposed method
The proposed algorithm is based on the YUV color space, which separates the
luminance from the chrominance and is less sensitive to changes in brightness
than the RGB color space. In particular, the four rules based on the statistical
color model proposed in [1] have been exploited. The first two rules r1 and r2 are
Improving fire detection reliability by a combination of videoanalytics 3

Fig. 1: Overview of the proposed approach: once the foreground mask has been
extracted, a multi expert system is used to combine color based and motion
based information.
based on the consideration that in flame pixels the Red channel value is greater
than the Green channel value, and the Green channel value is in turn greater
than the Blue channel value. Such a consideration, transformed into the YUV
color space, becomes, for the generic pixel (x, y) of the image:

\[ r_1 : Y(x,y) > U(x,y), \qquad r_2 : V(x,y) > U(x,y) \tag{1} \]
The third rule r3 can be obtained by considering that the flames' brightness
is higher than that of the other areas of the frame. This consideration suggests
that a fire pixel has Y and V components higher than the average Y and V
values in the frame, while its U component is lower than the average U value in
the frame:

\[ r_3 : \; Y(x,y) > \frac{1}{N}\sum_{k=1}^{N} Y(x_k,y_k), \quad U(x,y) < \frac{1}{N}\sum_{k=1}^{N} U(x_k,y_k), \quad V(x,y) > \frac{1}{N}\sum_{k=1}^{N} V(x_k,y_k) \tag{2} \]
In our experiments, (γ1 , γ2 , γ3 , γ4 ) have been set to (1, 1, 1, 1) to equally
weight the considered contributions.
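As a minimal sketch (assuming the three channels are held in NumPy arrays, and assuming the r1, r2 formulation of the statistical model of [1]), the per-pixel color rules can be written as:

```python
import numpy as np

def fire_color_mask(y, u, v):
    """Hedged sketch of the YUV color rules: r1 and r2 follow the statistical
    model of [1], r3 compares each pixel with the frame averages (Eq. 2).
    y, u, v are 2-D float arrays holding the frame's YUV channels."""
    r1 = y > u                                                # r1: Y dominates U
    r2 = v > u                                                # r2: V dominates U
    r3 = (y > y.mean()) & (u < u.mean()) & (v > v.mean())     # Eq. (2)
    return r1 & r2 & r3                                       # candidate fire pixels
```

With unit weights (γ1, ..., γ4) the rules simply act as a logical conjunction; different weights would instead turn them into a soft vote.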
Fig. 2: Motion evaluation: for each box, the descriptors' matchings M^a and
M^b, associated respectively to boxes a and b, are evaluated according to the
previously defined dictionary D. The occurrences of the angles H^a and H^b are
computed and the reliabilities ψ_c^a and ψ_c^b are obtained: 0.06 for a and 0.72
for b.
Given the feature vectors V_t and V_{t-1}, the 1:1 matching M(V_t, V_{t-1})
is performed inside each box by minimizing the distance, so that the generic
matching m_j is given by:

\[ m_j = \arg\min_{a,b} \, distance(v_t^a, v_{t-1}^b), \quad a \in \{1, ..., |V_t|\}, \; b \in \{1, ..., |V_{t-1}|\}. \]

Note that the maximum size of M depends on the number of descriptors, and
can thus be bounded as |M| ≤ min(|V_t|, |V_{t-1}|).
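The 1:1 constraint can be realized, for instance, with a simple greedy nearest-neighbour pairing. This is a hypothetical sketch: the paper does not specify the exact matching scheme beyond distance minimization.

```python
import numpy as np

def greedy_match(desc_t, desc_prev):
    """Greedy sketch of the 1:1 descriptor matching M(V_t, V_{t-1}): each
    descriptor of the current frame is paired with its nearest unused
    descriptor of the previous frame, so |M| <= min(|V_t|, |V_{t-1}|)."""
    matches, used = [], set()
    for a, va in enumerate(desc_t):
        # candidate distances to the still-unmatched previous descriptors
        dists = [(np.linalg.norm(va - vb), b)
                 for b, vb in enumerate(desc_prev) if b not in used]
        if not dists:
            break                        # previous frame exhausted
        _, b = min(dists)                # nearest unused descriptor
        used.add(b)
        matches.append((a, b))
    return matches
```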
For each matching m_j, the angle φ_j associated to the movement is evaluated as

\[ \varphi_j = \arctan\left(\frac{m_j|_y}{m_j|_x}\right), \]

being m_j|_x and m_j|_y the horizontal and the vertical components of m_j,
respectively. φ_j is then quantized according to a manually defined dictionary D,
obtained by uniformly partitioning the round angle into a fixed number |D| of
sectors:

\[ D = \left\{ d_k \;\middle|\; d_k \in \left[ k\tfrac{2\pi}{|D|}, (k+1)\tfrac{2\pi}{|D|} \right) \right\}. \]

|D| has been experimentally set to 6 in this paper. In particular, φ_j is associated
to the sector s_j it belongs to, among the |D| available: s_j = d_k | φ_j ∈ d_k.
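The quantization step can be sketched as follows (using atan2 so that the full [0, 2π) round angle is covered; the paper's arctan formulation is assumed to be intended modulo the quadrant):

```python
import math

def quantize_angle(dx, dy, n_sectors=6):
    """Quantize a match displacement's direction into one of n_sectors equal
    bins covering [0, 2*pi), as in the dictionary D (|D| = 6 in the paper)."""
    phi = math.atan2(dy, dx) % (2 * math.pi)   # full-circle angle of the match
    return int(phi // (2 * math.pi / n_sectors))
```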
For each box, the angles φ = {φ_1, ..., φ_{|M|}} are computed and a high level
representation of the box is built by evaluating the occurrences of the angles.
The obtained vector H = {h_1, ..., h_{|D|}} can be computed as follows:

\[ h_i = \sum_{m=1}^{|M|} \delta(s_m, i), \quad i = 1, ..., |D|, \]

being δ(·) the Kronecker delta.
Finally, the reliability ψ_m associated to the object is evaluated as:

\[ \psi_m = 1 - \frac{\max(H)}{\sum_{k=1}^{|H|} h_k}. \]

This means that, as shown in Figure 2, the corner points associated to people
(box a) move approximately in the same direction, and the high level
representation is polarized toward one or just a few angles (angle B in the
example). On the other side, the angles extracted from the fire's movement are
much more spread out, implying a higher reliability (0.72 against 0.06 in the
example).
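The reliability computation reduces to a few lines; the zero-matches fallback is an assumption not covered by the paper:

```python
def motion_reliability(hist):
    """Motion reliability psi_m = 1 - max(H) / sum(H): close to 0 for coherent
    motion (one dominant direction), higher for disordered, fire-like motion."""
    total = sum(hist)
    if total == 0:
        return 0.0           # no matches: assume coherent / non-fire (assumption)
    return 1.0 - max(hist) / total
```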
The information obtained by evaluating color and movement is finally combined
by a Multi Expert System (MES). In particular, the classification reliability ψ
is evaluated by a weighted voting rule which combines ψ_c and ψ_m:
ψ = (α_c · ψ_c + α_m · ψ_m)/(α_c + α_m).
The weights α_c and α_m are dynamically evaluated during the training step,
depending on the overall reliability of each single expert module. In particular,
given the misclassification matrix C^(k) computed by the expert module e_k on
the training set, such values can be determined by evaluating the probability
that the pattern x under test, belonging to class i, is assigned to the right
class by the expert module e_k, with k ∈ {c, m} [6]:

\[ \alpha_k = P(x \in i \mid e_k(x) = i) = C^{(k)}_{ii} \Big/ \sum_{j=1}^{M} C^{(k)}_{ij}, \tag{4} \]

being M the number of classes (two in the proposed approach, fire and non fire)
and C^{(k)}_{ij} the value of the misclassification matrix in position (i, j).
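Under the convention that rows of the misclassification matrix index the true class and columns the assigned class (an assumption; the paper does not state the orientation), Eq. (4) can be computed as:

```python
def expert_weight(conf_matrix, cls):
    """Eq. (4): alpha_k = C_ii / sum_j C_ij for the expert's matrix C^(k).
    conf_matrix is a list of rows; rows are assumed to index the true class."""
    row = conf_matrix[cls]
    return row[cls] / sum(row)    # fraction of class `cls` correctly handled
```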
Finally, the decision is taken according to a threshold β: if ψ ≥ β for at least
one box, then a fire event is detected and an alert is sent to the human operator.
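The final combination and thresholding then amount to:

```python
def mes_decision(psi_c, psi_m, alpha_c, alpha_m, beta=0.70):
    """Weighted voting rule of the MES followed by the beta threshold:
    a fire alert is raised for a box when psi >= beta."""
    psi = (alpha_c * psi_c + alpha_m * psi_m) / (alpha_c + alpha_m)
    return psi >= beta, psi
```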
3 Experimental results
Fig. 3: A few images extracted from the videos used for testing the method: (a)
fire1, (b) fire4, (c) fire6, (d) fire13, (e) fire14, (f) fire15, (g) fire17, (h) fire21.
performance on the training set (β = 0.7). In order to further confirm the
effectiveness of the proposed approach, a thorough comparison with state of the
art approaches has been performed.
The results are summarized in Figure 1. We can note that, in general, YUV
based approaches strongly outperform RGB based ones, both in terms of accuracy
and of false positives. This consideration confirms our choice to exploit a YUV
based strategy for the evaluation of the color. Furthermore, the results obtained
by the proposed MES based on YUV and movement evaluation (accuracy =
92.59% and false positive rate = 6.67%) outperform the other considered
approaches, confirming the effectiveness of the proposed methodology.
Finally, we also evaluated the computational cost of the proposed approach.
In particular, we used a standard computer, equipped with an Intel dual core
T7300 processor and 4 GB of RAM. Considering 1CIF videos, the proposed
method works at an average frame rate of 70 frames per second on the above
mentioned platform, making it especially suited for low-cost real applications.
Fig. 4: Results obtained by the proposed system in terms of ROC curve, on the
left, and misclassification matrix, on the right, computed with β = 0.70. The
misclassification matrix is the following:

              Predicted Fire   Predicted No Fire
GT Fire           91.67%             8.33%
GT No Fire         6.67%            93.33%
4 Conclusions
In this paper we proposed a method for detecting fires in both indoor and outdoor
environments. The main advantage of the proposed approach lies in the fact
that the chosen combination significantly reduces the number of false positives
detected by the system. Furthermore, deploying such an application on existing
video surveillance systems only slightly increases their cost: in fact, on one side,
no additional cameras need to be installed and the existing ones can still be
used, since the proposed method does not require an ad hoc setup. On the other
side, the obtained performance, both in terms of accuracy and computational
cost, confirms its applicability in real applications.
Acknowledgment
This research has been partially supported by A.I.Tech s.r.l. (http://www.aitech-
solutions.eu).
References
1. Celik, T., Demirel, H.: Fire detection in video sequences using a generic color
model. Fire Safety Journal 44(2), 147–158 (2009)
2. Celik, T., Demirel, H., Ozkaramanli, H., Uyguroglu, M.: Fire detection using sta-
tistical color model in video sequences. J. Vis. Commun. Image Represent. 18(2),
176–185 (Apr 2007), http://dx.doi.org/10.1016/j.jvcir.2006.12.003
3. Cetin, A.E., Dimitropoulos, K., Gouverneur, B., Grammalidis, N., Gunay, O.,
Habiboglu, Y.H., Toreyin, B.U., Verstockt, S.: Video fire detection: a review. Dig-
ital Signal Processing 23(6), 1827 – 1843 (2013)
4. Cetin, E.: Computer vision based fire detection dataset (May 2014), http://
signal.ee.bilkent.edu.tr/VisiFire/
5. Conte, D., Foggia, P., Petretta, M., Tufano, F., Vento, M.: Meeting the application
requirements of intelligent video surveillance systems in moving object detection.
In: Pattern Recognition and Image Analysis, LNCS, vol. 3687, pp. 653–662 (2005)
6. Lam, L., Suen, C.Y.: Optimal combinations of pattern classifiers. Pattern Recog-
nition Letters 16(9), 945 – 954 (1995)
7. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Com-
put. Vision 60(2), 91–110 (Nov 2004)
8. Mivia: Mivia fire detection dataset (May 2014), http://mivia.unisa.it/
9. Qi, X., Ebert, J.: A computer vision-based method for fire detection in color videos.
International Journal of Imaging 2(9 S), 22–34 (2009)
10. Rafiee, A., Tavakoli, R., Dianat, R., Abbaspour, S., Jamshidi, M.: Fire and smoke
detection using wavelet analysis and disorder characteristics. In: IEEE ICCRD.
vol. 3, pp. 262–265 (March 2011)
11. Ravichandran, A., Soatto, S.: Long-range spatio-temporal modeling of video with
application to fire detection. In: ECCV. pp. 329–342 (2012)
12. Shi, J., Tomasi, C.: Good features to track. In: IEEE CVPR. pp. 593–600 (1994)
13. Töreyin, B.U., Dedeoğlu, Y., Güdükbay, U., Çetin, A.E.: Computer vision based
method for real-time fire and flame detection. Pattern Recogn. Lett. 27(1), 49–58
(Jan 2006)
14. Yu, C., Mei, Z., Zhang, X.: A real-time video fire flame and smoke detection algo-
rithm. Procedia Engineering 62, 891–898 (2013), Asia-Oceania Symposium on
Fire Science and Technology