
JOURNAL OF COMPUTING, VOLUME 2, ISSUE 12, DECEMBER 2010, ISSN 2151-9617


Objective Performance Evaluation of Feature-Based Video Fusion
Anjali Malviya, S. G. Bhirud

• Anjali Malviya is Assistant Professor with the Dept. of IT, TSEC, University of Mumbai, India.
• S. G. Bhirud is Assistant Professor with the Dept. of Computer Engineering, VJTI, Mumbai, India.

Abstract— An objective fusion measure should extract all the perceptually important information that exists in the input images and measure the ability of the fusion process to transfer this information, as accurately as possible, into the output image. Most fusion evaluation algorithms dealing with still image fusion explicitly aim at representing spatial information from the inputs in the fused image with optimal accuracy. This does not preclude their use in video fusion, as they can be applied to each multi-sensor frame independently. This paper deals with the objective evaluation of multi-sensor video fusion. For this purpose an established static image fusion evaluation framework, based on edge information rather than regional information, is used. The metric reflects the quality of visual information obtained from the fusion of input images, and we use it to compare the performance of feature-level video fusion with pixel-level video fusion algorithms.

Index Terms—Video fusion, performance analysis.


1 INTRODUCTION

There is a need for robust methods of evaluating the results of dynamic image fusion and comparing the performance of various algorithms. An issue of further interest is that of adaptive fusion. Recently, schemes have emerged that rely on robust objective evaluation to adapt the parameters of the fusion algorithm to current conditions and inputs in order to achieve optimal fusion robustness [1]. This is particularly applicable to real-time video fusion, where input conditions may change considerably over long periods while the costs of parameter optimization can be spread across a number of frames [1]. In this context the robustness of performance evaluation is also critical, and it may not be sufficiently provided by existing still fusion evaluation metrics.

Performance evaluation of still image fusion has been studied relatively extensively in the past, with a number of algorithms published in the literature. They are clustered around a number of key ideas. The most natural approach is the concept of subjective fusion evaluation, where representative audiences of observers are asked to perform tasks with, or simply view and evaluate, fused imagery. The main drawback of such trials, however, is that they require complex display equipment and the organisation of an audience, making them highly impractical. Hence objective fusion metrics that require no display equipment or audience have emerged. They require no ground truth data and produce a single numerical score reflecting fusion performance based entirely on the analysis of the inputs and the fused image. They can be realised computationally in full, making them suitable for demanding video fusion evaluation. One such evaluation approach is based on the Universal Image Quality Index [2], where local image statistics are used to define a similarity between all corresponding 8×8 blocks across the input and fused images. Information theoretic measures based on global image statistics, such as entropy and mutual information, have also been considered within the context of fusion evaluation [3, 4]. These metrics explicitly ignore local structure but, despite this apparent shortcoming, they can achieve high levels of evaluation accuracy when considering reasonable fusion algorithms, i.e. those that aim to preserve the spatial structure of the inputs [4]. Mutual information has also been the basis for the most significant sequence fusion evaluation metric proposed so far. Its aim was to measure the effects of various decomposition approaches on the temporal stability and overall quality of a fused sequence of images. The metric is evaluated by considering differential entropies of joint variables constructed from the inter-frame differences of the input images and the fused image.

2 FEATURE BASED VIDEO FUSION

The fusion process can be performed at different levels of abstraction: pixel, feature and symbol level [5]. In image fusion at the pixel level, the intensity at each pixel in the fused image is determined from a set of pixels from each input source [6]. Fusion at a higher level of abstraction requires the extraction of various features contained in the input sources. Typical features are edges or regions extracted by appropriate segmentation procedures. We implemented the most primitive pixel-based video fusion techniques: four frames of IR video [Fig. 1] and four RGB frames [Fig. 2] have been fused, frame by frame. Figure 3(a-d) shows the fused images obtained using average, block fusion, maximum and wavelet based techniques.
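To make these pixel-level baselines concrete, the following is a minimal sketch of average, maximum and block fusion of two aligned grayscale frames, written in Python with NumPy. The paper does not state its block-selection rule, so local variance is assumed here as the activity measure, and the function names are ours; wavelet-based fusion follows the same selection idea applied to DWT coefficients rather than pixel blocks.

import numpy as np

def fuse_average(a, b):
    # Pixel-wise mean of two aligned, equally sized grayscale frames.
    return (a.astype(np.float64) + b.astype(np.float64)) / 2.0

def fuse_maximum(a, b):
    # Pixel-wise maximum: keeps the locally brighter sensor response.
    return np.maximum(a, b)

def fuse_block(a, b, block=8):
    # Block fusion: each block x block tile is copied from whichever
    # input has the higher local variance (an assumed activity measure).
    out = np.empty(a.shape, dtype=np.float64)
    for i in range(0, a.shape[0], block):
        for j in range(0, a.shape[1], block):
            ta = a[i:i + block, j:j + block].astype(np.float64)
            tb = b[i:i + block, j:j + block].astype(np.float64)
            out[i:i + block, j:j + block] = ta if ta.var() >= tb.var() else tb
    return out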


Additive and multiplicative noise is an unwanted component of videos. It can occur as Gaussian noise or film grain noise and may have undesirable effects on surveillance applications. Improved performance can be achieved by considering a spatio-temporal filtering of the video frames. We implement a feature-level video fusion in which the object (a pedestrian in this case) is segmented from the background using the thermal-image features of the pedestrian. We use the area of the bright region to detect the pedestrian: if the area of the bright region is greater than a threshold value and the height/width ratio lies in a previously established range, the region is regarded as a pedestrian; otherwise it is regarded as a noise region. The fusion results obtained for four frames of the "Dublin Sequence" are shown in Fig. 4.
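This detection rule reduces to connected-component analysis on a thresholded IR frame. Below is a minimal sketch using SciPy labelling; the intensity threshold, minimum area and height/width range are illustrative assumptions, not values reported in the paper.

import numpy as np
from scipy import ndimage

def detect_pedestrians(ir_frame, intensity_thresh=200,
                       min_area=150, ratio_range=(1.5, 4.0)):
    # Boolean mask of bright IR regions accepted as pedestrians.
    bright = ir_frame > intensity_thresh       # candidate hot regions
    labels, _ = ndimage.label(bright)          # connected components
    mask = np.zeros(ir_frame.shape, dtype=bool)
    for lab, sl in enumerate(ndimage.find_objects(labels), start=1):
        region = labels[sl] == lab
        area = int(region.sum())
        height = sl[0].stop - sl[0].start
        width = sl[1].stop - sl[1].start
        # Keep large, upright (tall) blobs; everything else is noise.
        if area > min_area and ratio_range[0] <= height / width <= ratio_range[1]:
            mask[sl] |= region
    return mask

The resulting mask can then drive the fusion, e.g. fused = np.where(mask, ir_frame, visible_frame), so that the segmented pedestrian is carried from the thermal input into the visible background.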
3 PERFORMANCE EVALUATION OF VIDEO FUSION

3.1 Methodology
The goal in pixel-level image fusion is to combine and preserve in a single output image all the important visual information that is present in a number of input images. Thus an objective fusion measure should extract all the perceptually important information that exists in the input images and measure the ability of the fusion process to transfer this information, as accurately as possible, into the output image. A measure for objectively assessing pixel-level fusion performance is the Xydeas-Petrovic approach [7]. We evaluate the results obtained from fusion of the Dublin video sequence using the quality index suggested by Xydeas and Petrovic, and accordingly compare the image fusion schemes. This measure is based upon the premise that the human visual system resolves intensity variations, i.e. edges, rather than absolute signal values. The measure thus uses only edge information and not regional information.

The metric reflects the quality of visual information obtained from the fusion of input images and can be used to compare the performance of different image fusion algorithms. Experimental results clearly indicate that this metric is perceptually meaningful. The index is based on the observation that the human visual system is particularly sensitive to the edges in an image. Therefore, the performance of a fusion method is evaluated as the quantity of information associated with the edges that is transferred from the input images to the fused one. First, the edge information is extracted from the input images; then, the edge strength and orientation are calculated. These features are subsequently used for the index evaluation.

Fig. 1. Four IR frames from the IR video

3.2 Implementation
Consider two input images A and B, and a resulting fused image F. Note that the following methodology can easily be applied to more than two input images. A Sobel edge operator is applied to yield the edge strength g(n,m) and orientation α(n,m) information for each pixel p(n,m), 1 ≤ n ≤ N and 1 ≤ m ≤ M. Thus, for an input image A:

g_A(n,m) = \sqrt{ s_x^A(n,m)^2 + s_y^A(n,m)^2 }    (1)

α_A(n,m) = \tan^{-1}\big( s_y^A(n,m) / s_x^A(n,m) \big)    (2)

where s_x^A(n,m) and s_y^A(n,m) are the outputs of the horizontal and vertical Sobel templates centred on pixel p(n,m) and convolved with the corresponding pixels of image A. The relative strength and orientation values G^{AF}(n,m) and A^{AF}(n,m) of an input image A with respect to F are then formed:

G^{AF}(n,m) = g_F(n,m)/g_A(n,m)  if g_A(n,m) > g_F(n,m),  g_A(n,m)/g_F(n,m) otherwise    (3)

A^{AF}(n,m) = 1 − |α_A(n,m) − α_F(n,m)| / (π/2)    (4)

These are used to derive the edge strength and orientation preservation values Q_g^{AF}(n,m) and Q_α^{AF}(n,m), which model the perceptual loss of information in F in terms of how well the strength and orientation values of a pixel p(n,m) in A are represented in the fused image. The algorithm involves the following steps.

Fig. 2. Four RGB frames from the visible video
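Equations (1) and (2) map directly onto a few lines of NumPy/SciPy; the sketch below uses our own function name and a small epsilon to guard the division in (2).

import numpy as np
from scipy import ndimage

def edge_strength_orientation(img):
    # Per-pixel Sobel edge strength g(n,m), eq. (1), and orientation
    # alpha(n,m), eq. (2), for a grayscale image.
    img = img.astype(np.float64)
    sx = ndimage.sobel(img, axis=1)        # horizontal Sobel template s_x
    sy = ndimage.sobel(img, axis=0)        # vertical Sobel template s_y
    g = np.hypot(sx, sy)                   # eq. (1): sqrt(sx^2 + sy^2)
    alpha = np.arctan(sy / (sx + 1e-12))   # eq. (2); range (-pi/2, pi/2)
    return g, alpha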


A. Derive Edge Strength and Orientation
Run the Sobel mask for the x and y gradients on the input image and find the edge strength and orientation using (1) and (2).

B. Relative Strength and Orientation Values
Form the relative strength and orientation values G^{AF}(n,m) and A^{AF}(n,m) of each input image with respect to F using (3) and (4), and from these derive the preservation values Q_g^{AF}(n,m) and Q_α^{AF}(n,m), with Q^{AF}(n,m) = Q_g^{AF}(n,m) · Q_α^{AF}(n,m).

C. Edge Information Preservation Values
After we get Q^{AF}(n,m) and Q^{BF}(n,m) from the input images A and B, a weighted performance metric Q^{AB/F} is calculated as follows:

Q^{AB/F} = \frac{ \sum_{n=1}^{N} \sum_{m=1}^{M} \big( Q^{AF}(n,m) w^A(n,m) + Q^{BF}(n,m) w^B(n,m) \big) }{ \sum_{n=1}^{N} \sum_{m=1}^{M} \big( w^A(n,m) + w^B(n,m) \big) }    (5)

where the weights are w^A(n,m) = [g_A(n,m)]^L and w^B(n,m) = [g_B(n,m)]^L, and L is a constant.

Fig. 3. Fusion using (a) Average (b) Block (c) Maxima (d) Wavelet
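Putting steps A-C together, a sketch of the full Q^{AB/F} computation follows, reusing edge_strength_orientation from the sketch above. The sigmoid constants Γ, κ and σ are the values commonly quoted for this metric in the literature; treat them, like the default L, as assumptions of this sketch rather than parameters taken from the paper.

import numpy as np

# Commonly quoted sigmoid constants for the Xydeas-Petrovic metric.
GAMMA_G, KAPPA_G, SIGMA_G = 0.9994, -15.0, 0.5
GAMMA_A, KAPPA_A, SIGMA_A = 0.9879, -22.0, 0.8
EPS = 1e-12

def edge_preservation(g_in, a_in, g_f, a_f):
    # Q^{XF}(n,m): how well strength and orientation of input X survive in F.
    # Relative strength, eq. (3): ratio of the weaker edge to the stronger.
    G = np.where(g_in > g_f, g_f / (g_in + EPS), g_in / (g_f + EPS))
    # Relative orientation, eq. (4).
    A = 1.0 - np.abs(a_in - a_f) / (np.pi / 2.0)
    Qg = GAMMA_G / (1.0 + np.exp(KAPPA_G * (G - SIGMA_G)))
    Qa = GAMMA_A / (1.0 + np.exp(KAPPA_A * (A - SIGMA_A)))
    return Qg * Qa                     # Q^{XF} = Q_g^{XF} * Q_alpha^{XF}

def q_abf(img_a, img_b, fused, L=1.0):
    # Weighted metric Q^{AB/F}, eq. (5), with weights w = g^L (step C).
    g_a, a_a = edge_strength_orientation(img_a)
    g_b, a_b = edge_strength_orientation(img_b)
    g_f, a_f = edge_strength_orientation(fused)
    q_af = edge_preservation(g_a, a_a, g_f, a_f)
    q_bf = edge_preservation(g_b, a_b, g_f, a_f)
    w_a, w_b = g_a ** L, g_b ** L
    return float((q_af * w_a + q_bf * w_b).sum() / (w_a + w_b).sum())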

3.3 Results
Xydeas-Petrovic is an index that measures the quality of
the object introduced in the final result. The experimental
results of the performance measurement of this metric are
shown to be in agreement with preference scores ob-
tained from informal subjective tests. Furthermore, this
clearly indicates that the Xydeas-Petrovic fusion measure
is perceptually meaningful.
TABLE I
PERFORMANCE RESULTS FOR THE DIFFERENT FU-
SION TECHNIQUES

Fusion Technique Xydeas-Petrovic Index


Fig. 4. Four Frames from the feature-based fused video

Simple Average 0.38916


A. Derive Edge Strength and Orientation
Run the Sobel mask for x and y gradients on the input
image and find the edge strength and orientation using Simple Block 0.43236
(1) and (2).

B. Relative Strength and Orientation Values Simple Maximum 0.43218

Wavelet 0.49374

Feature Level Fusion 0.5438


An analysis based on the evaluation of objective indices is presented in Table I. The table reports the average value of the Xydeas-Petrovic index computed over the entire sequence (500 frames). The evaluated quality indices are consistent with visual perception, demonstrating that the best performance is achieved by the feature-level method involving object extraction using segmentation.
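A per-sequence figure such as those in Table I can then be obtained by averaging the per-frame index; the frame-list names below are hypothetical.

# Average the per-frame Q^{AB/F} over the whole fused sequence.
scores = [q_abf(ir, vis, fused)
          for ir, vis, fused in zip(ir_frames, visible_frames, fused_frames)]
sequence_score = sum(scores) / len(scores)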

REFERENCES
[1] V. Petrović, T. Cootes, "Objectively Adaptive Image Fusion", Information Fusion, Vol. 8(2), Elsevier, 2007, pp. 168-176.
[2] N. Cvejić, D. Bull, C. Canagarajah, "A New Metric for Multimodal Image Sensor Fusion", Electronics Letters, Vol. 43(2), pp. 95-96, IEE, 2007.
[3] G. Qu, D. Zhang, P. Yan, "Information Measure for Performance of Image Fusion", Electronics Letters, Vol. 38(7), pp. 313-315, IEE, 2002.
[4] V. Petrović, T. Cootes, "Information Representation for Image Fusion Evaluation", Proceedings of Fusion 2006, Florence, ISIF, July 2006.
[5] C. Pohl, J. L. van Genderen, "Multisensor Image Fusion in Remote Sensing: Concepts, Methods and Applications", International Journal of Remote Sensing, Vol. 19(5), pp. 823-854, 1998.
[6] G. Corsini, M. Diani, A. Masini, M. Cavallini, "Enhancement of Sight Effectiveness by Dual Infrared System: Evaluation of Image Fusion Strategies", ICTA'05.
[7] C. Xydeas, V. Petrović, "Objective Pixel-level Image Fusion Performance Measure", Proc. SPIE, Vol. 4051, April 2000, pp. 89-99.
