

Integrated Computer-Aided Engineering 12 (2005) 135–146
IOS Press

A multiscale approach to pixel-level image fusion

A. Ben Hamza^a, Yun He^b, Hamid Krim^c,∗ and Alan Willsky^d

^a Concordia Institute for Information Systems Engineering, Concordia University, Montréal, Quebec, Canada H3G 1T7
E-mail: hamza@ciise.concordia.ca
^b Tality Corporation, Cary, NC 27511, USA
E-mail: yhe@tality.com
^c Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC 27695-7914, USA
E-mail: ahk@ncsu.edu
^d Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, MA 02139-4307, USA
E-mail: willsky@mit.edu

∗ Corresponding author.

Abstract. Pixel-level image fusion refers to the processing and synergistic combination of information gathered by various imaging sources to provide a better understanding of a scene. We formulate image fusion as an optimization problem and propose an information-theoretic approach in a multiscale framework to obtain its solution. A biorthogonal wavelet transform of each source image is first calculated, and a new Jensen-Rényi divergence-based fusion algorithm is developed to construct composite wavelet coefficients according to the measurement of the information patterns inherent in the source images. Experimental results on the fusion of multi-sensor navigation images, multi-focus optical images, multi-modality medical images and multi-spectral remote sensing images are presented to illustrate the proposed fusion scheme.

1. Introduction

The rapid development and deployment of new imaging sensors underline the urgency for novel image processing techniques that can effectively fuse images from different sensors into a single composite for interpretation. Fusion typically begins with two or more registered images with different representations of the same scene. They may come from different viewing conditions, or even different sensors. Image fusion of multiple sensors in a vision system could significantly reduce human/machine error in detection and recognition of objects, thanks to the inherent redundancy and extended coverage. For example, fusion of forward looking infrared (FLIR) and low light television (LLTV) images obtained by an airborne sensor platform would aid a pilot to navigate in poor weather conditions or darkness.

The actual fusion process can take place at different levels of information representation. A generic categorization is to consider a process at the signal, pixel, or feature and symbolic levels [12]. We focus on the so-called pixel-level fusion process, where a composite image has to be synthesized from several input images. Some generic requirements must be imposed on the fusion result. The fusion process should preserve all relevant information of the input imagery in the composite image (pattern conservation). In particular, the fusion scheme should not introduce any artifacts or inconsistencies which would distract a human observer or the following processing stages (i.e. causality).

Over the past two decades, a wide variety of pixel-level image fusion algorithms have been developed.

Fig. 1. A general framework for multiscale fusion with the wavelet transform. Registered source images are decomposed by the DWT into wavelet coefficients; fusion rules yield a fusion decision map and fused wavelet coefficients; an inverse transform produces the fused image.

These techniques can be classified into linear superposition, logical filter [1], mathematical morphology [9], image algebra [5,15], artificial neural network [7], and simulated annealing methods [18]. All of these algorithms aim at making the fused image reveal new information concerning features that cannot be perceived in individual sensor images. Some useful information is, however, discarded, since each fusion scheme tends to favor some attributes of the image more heavily than others. A detailed overview of these techniques is given in [12].

Inspired by the fact that the human visual system processes and analyzes image information at different scales, researchers have recently proposed a multiscale-based fusion method, and this has been widely accepted [4] as one of the most effective techniques for image fusion. A multiscale transform, which may be in the form of a pyramid or wavelet transform, is first calculated for each input image, and a composite is then formed by selecting coefficients from the multiscale representations of all source images. A fused image is finally recovered through an inverse transformation. In Burt's pioneering work [2], a Laplacian pyramid and a "choose max" rule are proposed as a model for binocular fusion in human stereo vision. For each coefficient in the pyramids of the source images, the one with the maximum amplitude is copied to the composite pyramid, which serves to reconstruct a fused image by its inverse pyramid transform. More recently, fusion within a gradient pyramid was shown to provide improved stability and noise immunity [3].

Wavelet theory has played a particularly important role in multiscale analysis (see [8,17] and the references therein). A general framework for multiscale fusion with the wavelet transform is shown in Fig. 1. A number of papers have proposed fusion algorithms based on the orthogonal wavelet transform [10,19,16]. The success of the wavelet transform may be credited to certain advantages it offers over the Laplacian pyramid-based techniques. The wavelet bases may be chosen to be orthogonal, making the information gleaned at different resolutions unique. The pyramid decomposition, on the other hand, contains redundancy between different scales. Furthermore, a wavelet image representation provides directional information in the high-low, low-high and high-high bands, while the pyramid representation fails to introduce any spatial orientation selectivity into the decomposition process. A major limitation of all recent wavelet-based fusion algorithms, however, is the absence of a good fusion criterion. Most existing selection rules are to a large extent similar to "choose max". This in turn induces a significant amount of high frequency noise, introduced by the systematic and sudden inclusion of the maximal wavelet coefficient of a given source. This is particularly problematic given the highly undesirable perception of high frequency noise by human vision.

In this paper, and as we explain below, we apply a biorthogonal wavelet transform in carrying out pixel-level image fusion. It is well known that smooth biorthogonal wavelets of compact support may be either symmetric or antisymmetric, unlike orthogonal wavelets, with the exception of the Haar wavelet. Symmetric or antisymmetric wavelets are synthesized with perfect reconstruction filters with a linear phase. This is also a very desirable property for image fusion applications. In lieu of the "choose max" type of selection rule, we propose an information-theoretic fusion scheme. To each pixel in a source image is associated a vector consisting of the wavelet coefficients at that position across scales, which in turn is used to reflect the "activity" of that pixel.

Fig. 2. A fast biorthogonal two-dimensional wavelet transform (a) and its inverse transform (b), implemented by perfect reconstruction filter banks (filtering along rows and columns, with subsampling/upsampling by 2).

We refer to these indicator vectors of all the pixels in a source image as its activity map. A decision map is then obtained by applying an information-theoretic divergence to measure all the source activity maps. To objectively compare activity indicator vectors, we propose the recently developed Jensen-Rényi divergence measure [6], which is defined in terms of Rényi entropy [14]. The wavelet coefficients of the fused image are in the end selected according to the decision map. Since all the scales, from fine to coarse, are considered to assess the activity at a particular position within an image, our approach will clearly be more accurate, in the sense of selecting coefficients containing rich information, than "select max" type fusion schemes.

The remainder of this paper is organized as follows. In the next section, and for completeness, we briefly review the biorthogonal wavelet representation of an image. A concise formulation of the problem is given in Section 3. In Section 4, we propose an information-theoretic fusion algorithm using the Jensen-Rényi divergence, and present some substantiating numerical experiments. We finally provide concluding remarks in Section 5.

Fig. 3. Rényi's entropy $R_\alpha(p)$ of a Bernoulli distribution $p = (p, 1-p)$, plotted for $\alpha = 0, 0.2, 0.5, 0.7$ and the Shannon limit.
2. Background on biorthogonal wavelet representation of images

Let $\{\varphi, \tilde\varphi\}$ and $\{\psi, \tilde\psi\}$ be two dual pairs of scaling functions and wavelets that generate biorthogonal wavelet bases of $L^2(\mathbb{R}^2)$. For $t = (t_1, t_2) \in \mathbb{R}^2$ and $n = (n_1, n_2) \in \mathbb{Z}^2$, we write

$$\varphi^2_{j,n}(t) = \varphi_{j,n_1}(t_1)\,\varphi_{j,n_2}(t_2), \qquad (1)$$

$$\psi^1(t) = \varphi(t_1)\psi(t_2), \quad \psi^2(t) = \psi(t_1)\varphi(t_2), \quad \psi^3(t) = \psi(t_1)\psi(t_2), \qquad (2)$$

$$\tilde\psi^1(t) = \tilde\varphi(t_1)\tilde\psi(t_2), \quad \tilde\psi^2(t) = \tilde\psi(t_1)\tilde\varphi(t_2), \quad \tilde\psi^3(t) = \tilde\psi(t_1)\tilde\psi(t_2). \qquad (3)$$

For $1 \le k \le 3$, we may write

$$\psi^k_{j,n}(t) = \frac{1}{2^j}\,\psi^k\!\left(\frac{t_1 - 2^j n_1}{2^j},\, \frac{t_2 - 2^j n_2}{2^j}\right), \qquad (4)$$

and

$$\tilde\psi^k_{j,n}(t) = \frac{1}{2^j}\,\tilde\psi^k\!\left(\frac{t_1 - 2^j n_1}{2^j},\, \frac{t_2 - 2^j n_2}{2^j}\right). \qquad (5)$$

It is easy to verify [13] that $\{\psi^1_{j,n}, \psi^2_{j,n}, \psi^3_{j,n}\}_{(j,n)\in\mathbb{Z}^3}$ and $\{\tilde\psi^1_{j,n}, \tilde\psi^2_{j,n}, \tilde\psi^3_{j,n}\}_{(j,n)\in\mathbb{Z}^3}$ form biorthogonal bases of $L^2(\mathbb{R}^2)$.

Any $f \in L^2(\mathbb{R}^2)$ has two possible decompositions in these bases,

$$f = \sum_{j}\sum_{n}\sum_{k=1}^{3} \langle f, \psi^k_{j,n}\rangle\, \tilde\psi^k_{j,n} = \sum_{j}\sum_{n}\sum_{k=1}^{3} \langle f, \tilde\psi^k_{j,n}\rangle\, \psi^k_{j,n}. \qquad (6)$$

Assuming we choose $\psi^k_{j,n}$ as the analysis wavelet, at any scale $2^j$ we denote the approximation coefficient by $a_j[n] = \langle f, \varphi^2_{j,n}\rangle$ and the wavelet coefficients by $d^k_j[n] = \langle f, \psi^k_{j,n}\rangle$, $k = 1, 2, 3$.

The three wavelets $\psi^k$ extract image details at different scales and orientations.
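As a sketch of the separable construction of Eqs. (2)-(5), the three 2-D wavelets can be tabulated as outer products of sampled 1-D functions. We assume PyWavelets here, whose `wavefun` returns the decomposition and reconstruction pairs for biorthogonal wavelets; the variable names are ours.

```python
# Sketch of Eq. (2): the three 2-D wavelets as outer products of the
# sampled 1-D scaling function phi and wavelet psi. For biorthogonal
# wavelets, wavefun() returns (phi_d, psi_d, phi_r, psi_r, x).
import numpy as np
import pywt

phi, psi, _, _, x = pywt.Wavelet("bior2.2").wavefun(level=6)

psi1 = np.outer(phi, psi)   # psi^1(t) = phi(t1) psi(t2)
psi2 = np.outer(psi, phi)   # psi^2(t) = psi(t1) phi(t2)
psi3 = np.outer(psi, psi)   # psi^3(t) = psi(t1) psi(t2)
```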

Fig. 4. 3D and contour plots of the Jensen-Rényi divergence $JR_\alpha(p, q)$ of two Bernoulli distributions with equal weights. Top: $\alpha \in (0, 1)$. Bottom: $\alpha > 1$.

Over positive frequencies, $\hat\varphi$ and $\hat\psi$ have an energy mainly concentrated respectively on the lower and higher frequencies. For $\omega = (\omega_1, \omega_2)$, the separable wavelet expressions imply that

$$\hat\psi^1(\omega) = \hat\varphi(\omega_1)\hat\psi(\omega_2), \quad \hat\psi^2(\omega) = \hat\psi(\omega_1)\hat\varphi(\omega_2), \quad \hat\psi^3(\omega) = \hat\psi(\omega_1)\hat\psi(\omega_2). \qquad (7)$$

Hence $|\hat\psi^1(\omega)|$ is larger at low horizontal frequencies $\omega_1$ and high vertical frequencies $\omega_2$, $|\hat\psi^2(\omega)|$ is larger at high horizontal frequencies $\omega_1$ and low vertical frequencies $\omega_2$, whereas $|\hat\psi^3(\omega)|$ is larger at both high horizontal frequencies $\omega_1$ and high vertical frequencies $\omega_2$. As a result, wavelet coefficients calculated with $\psi^1$ and $\psi^2$ are larger along edges which are respectively horizontal and vertical, and $\psi^3$ produces large coefficients at corners.
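This orientation selectivity is easy to confirm numerically. The following sketch (again assuming PyWavelets) pushes a horizontal and a vertical edge through one analysis step and reports the energy in each detail band; band naming conventions differ between libraries, so the mapping is read off the printed energies rather than hard-coded.

```python
# Empirical check of the orientation selectivity described above: an edge
# concentrates its detail energy in one band, and its 90-degree rotation
# in another. (Assumes PyWavelets; band order/naming varies by library.)
import numpy as np
import pywt

def band_energies(img):
    _, details = pywt.dwt2(img, "bior2.2")
    return [float(np.sum(d ** 2)) for d in details]

horiz = np.zeros((64, 64)); horiz[32:, :] = 1.0   # horizontal edge
vert = horiz.T                                    # vertical edge
print("horizontal edge:", band_energies(horiz))   # one band dominates
print("vertical edge:  ", band_energies(vert))    # a different band dominates
```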

The wavelet coefficients at scale $2^{j+1}$ are calculated from $a_j$ with two-dimensional separable convolutions and subsamplings. Let $\{h, g\}$ and $\{\tilde h, \tilde g\}$ be the perfect reconstruction filters associated with the biorthogonal wavelet $\{\psi, \tilde\psi\}$. For any pair of one-dimensional filters $y[n]$ and $z[n]$, we write the product filter $yz[n] = y[n_1]\,z[n_2]$, and denote $\bar y[n] = y[-n]$. For $n = (n_1, n_2)$,

$$a_{j+1}[n] = a_j \star \bar h\bar h\,[n], \qquad (8)$$
$$d^1_{j+1}[n] = a_j \star \bar h\bar g\,[n], \qquad (9)$$
$$d^2_{j+1}[n] = a_j \star \bar g\bar h\,[n], \qquad (10)$$
$$d^3_{j+1}[n] = a_j \star \bar g\bar g\,[n]. \qquad (11)$$

A separable two-dimensional convolution may be factored into one-dimensional convolutions along the rows and columns of the image. The factorization is illustrated in Fig. 2(a). The rows of $a_j$ are first convolved with $\bar h$ and $\bar g$, and subsampled by 2. The columns of these two output images are then convolved respectively with $\bar h$ and $\bar g$ and subsampled to yield the four subsampled images $a_{j+1}$, $d^1_{j+1}$, $d^2_{j+1}$ and $d^3_{j+1}$.

We denote

$$\breve y[n] = \breve y[n_1, n_2] = \begin{cases} y[k_1, k_2] & \text{if } (n_1, n_2) = (2k_1, 2k_2) \\ 0 & \text{otherwise.} \end{cases}$$

The image $\breve y$ is obtained by inserting a row of zeros and a column of zeros between every pair of consecutive rows and columns of $y[n_1, n_2]$, and $a_j$ is recovered from the coarser scale approximation $a_{j+1}$ and the wavelet coefficients $d^1_{j+1}$, $d^2_{j+1}$ and $d^3_{j+1}$ with the two-dimensional separable convolutions

$$a_j[n] = \breve a_{j+1} \star \tilde h\tilde h\,[n] + \breve d^1_{j+1} \star \tilde h\tilde g\,[n] + \breve d^2_{j+1} \star \tilde g\tilde h\,[n] + \breve d^3_{j+1} \star \tilde g\tilde g\,[n]. \qquad (12)$$

These four convolutions can also be factored into six groups of one-dimensional convolutions along rows and columns, as illustrated in Fig. 2(b).
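A quick round trip through one analysis step (Eqs. (8)-(11)) and the synthesis step (Eq. (12)) illustrates the perfect reconstruction property. The sketch below assumes PyWavelets, whose `dwt2`/`idwt2` implement this kind of separable filter-bank pair.

```python
# One analysis step (Eqs. (8)-(11)) followed by the synthesis step
# (Eq. (12)): the biorthogonal filter pairs reconstruct a_j up to
# floating-point precision. (Assumes PyWavelets.)
import numpy as np
import pywt

a_j = np.random.rand(128, 128)                     # stand-in for a_j[n]
a_next, details = pywt.dwt2(a_j, "bior2.2")        # a_{j+1}, (d1, d2, d3)
a_back = pywt.idwt2((a_next, details), "bior2.2")  # Eq. (12)
print(np.max(np.abs(a_back - a_j)))                # ~ 1e-15
```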
Let $a_J[n]$ be a digital image whose pixel interval equals $2^J = N^{-1}$. We associate to $a_J[n]$ a function $f \in L^2(\mathbb{R}^2)$ approximated at the scale $2^J$,

$$a_J[n] = \langle f, \varphi^2_{J,n}\rangle \approx \frac{1}{N}\, f(N^{-1} n).$$

A biorthogonal wavelet image representation of $a_J$ of depth $L - J$ is computed by iterating Eqs. (8)-(11) for $J < j \le L$:

$$\{d^1_j, d^2_j, d^3_j, a_L\}_{J < j \le L}. \qquad (13)$$

The original digital image $a_J$ is recovered from this wavelet representation by iterating the reconstruction Eq. (12) for $J < j \le L$.

3. Pre-fusion processing and problem formulation

Let $f_1, f_2, \ldots, f_m : \mathbb{Z}^2 \to \mathbb{R}$ be $m$ digital images of the same scene taken from different sensors. For the pixel-level image fusion problem, we assume all the source images are registered, so that differences in resolution, coverage, treatment of a theme, and characteristics of the image acquisition methods are deferred to other sources. The goal of our fusion algorithm is to construct a composite image in which all crucial information captured from all the source images is combined, to effectively achieve a compressed version of all the source image data. To this end, we exploit an information-theoretic fusion approach based on a biorthogonal wavelet representation.

Definition 1. Let $W f_i = \{d^1_i(j,n), d^2_i(j,n), d^3_i(j,n), a_i(L,n)\}_{0 < j \le L,\, n \in \mathbb{Z}^2}$ be a biorthogonal wavelet image representation of $f_i$ as defined in Eq. (13). Without loss of generality, we set $J = 0$. For any $n \in \mathbb{Z}^2$, an activity pattern vector is defined as

$$A_i(n) = \left( \sum_{k=1}^{3} |d^k_i(1, 2^{L-1} n)|^2,\ \sum_{k=1}^{3} |d^k_i(2, 2^{L-2} n)|^2,\ \ldots,\ \sum_{k=1}^{3} |d^k_i(L, n)|^2 \right), \qquad (14)$$

which is a $(1 \times L)$ vector of the energy concentrated at pixel $f_i(2^L n)$ across scales. We refer to $\{A_i(n)\}_{n \in \mathbb{Z}^2}$ as the activity map of source image $f_i$ (sketched in code below).

Activity maps highlight the inherent information pattern in source images. To proceed with a scale-driven fusion of the source wavelet coefficients, it is crucial to compare the activity patterns for every pixel. If, for instance, the activity patterns are different in some region, an averaging process of wavelet coefficients for a composite output is unlikely to be a good choice, as it yields artifacts. On the other hand, if the activity patterns are similar in a region, an averaging procedure would increase the information in the fused image on account of an enhanced contribution from different sources.

A reasonable measure for activity patterns should satisfy the following properties:

– It provides a quantitative difference measure between two or more activity patterns.
– It is nonnegative and symmetric.
– It vanishes (to zero) if and only if the activity patterns are exactly the same.
– It reaches its maximum value when the activity patterns are degenerate distributions.
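A sketch of Definition 1 follows, again assuming PyWavelets; the `periodization` mode is chosen so that every level is exactly half the size of the previous one and the dyadic sampling grid of Eq. (14) lines up exactly (image sides divisible by $2^L$ are assumed). The function name `activity_map` is ours.

```python
# Sketch of the activity map of Definition 1: for each coarsest-grid
# position n, collect sum_k |d_i^k(j, .)|^2 at the co-located coefficient
# of every scale into a (1 x L) pattern vector. (Assumes PyWavelets;
# image sides divisible by 2**L; patterns are ordered coarse to fine,
# which merely permutes the vector of Eq. (14).)
import numpy as np
import pywt

def activity_map(img, wavelet="bior2.2", L=3):
    coeffs = pywt.wavedec2(img, wavelet, level=L, mode="periodization")
    levels = []
    for j, details in enumerate(coeffs[1:]):   # coarsest -> finest
        e = sum(d ** 2 for d in details)       # sum over k of |d^k|^2
        f = 2 ** j                             # stride 2^(L-j) of Eq. (14)
        levels.append(e[::f, ::f])             # one energy per coarse cell
    return np.stack(levels, axis=-1)           # shape (N/2**L, N/2**L, L)
```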

Fig. 5. Multi-sensor image fusion: (1) a low-light-television sensor image; (2) a forward-looking-infrared image; (3) fusion by averaging; (4) fusion by wavelet based maximum selection scheme; (5) a selection map; (6) fusion by the proposed information theoretic approach.

In the next section, we propose an information-theoretic measure called the Jensen-Rényi divergence, which satisfies the above requirements. A decision map is then generated by applying the Jensen-Rényi divergence to measure the coherence of the source activity maps at the pixel level. We further segment the decision map into two regions, $D_0$ and $D_1$. The region $D_0$ is the set of pixels whose activity patterns are similar in all the source images, while the region $D_1$ is the set of pixels whose activity patterns are different. Our fusion technique is the solution to the following optimization problem:

$$W f = \arg\min_{f \in \mathcal{F}} \sum_{i=1}^{m} \left\{ \sum_{j=1}^{L} \sum_{2^j n \in D_0} \left| W f(j,n) - W f_i(j,n) \right|^2 - \sum_{j=1}^{L} \sum_{2^j n \in D_1} \left| W f(j,n) \right|^2 \right\}, \qquad (15)$$

where $\mathcal{F}$ is the set of all images $f$ whose wavelet transform satisfies

$$\min_i(W f_i(j,n)) \le W f(j,n) \le \max_i(W f_i(j,n)),$$

for any $0 < j \le L$ and $n \in \mathbb{Z}^2$. This constraint ensures that the solution does not yield features beyond the scope of the source images (i.e. artifacts).

4. Information-theoretic image fusion

4.1. Jensen-Rényi divergence measure

Let $k \in \mathbb{N}$ and $X = \{x_1, x_2, \ldots, x_k\}$ be a finite set with a probability distribution $p = (p_1, p_2, \ldots, p_k)$, i.e. $\sum_{j=1}^{k} p_j = 1$ and $p_j = P(X = x_j) \ge 0$, where $P(\cdot)$ denotes probability.

Shannon's entropy is defined as $H(p) = -\sum_{j=1}^{k} p_j \log(p_j)$, and it is a measure of uncertainty, dispersion, information, and randomness. The maximum uncertainty, or equivalently minimum information, is achieved by the uniform distribution. Hence, we can think of entropy as a measure of the uniformity of a probability distribution.

Fig. 6. Multi-focus image fusion: (1) an image focused on the larger clock; (2) an image focused on the smaller clock; (3) fusion by averaging; (4) fusion by wavelet based maximum selection scheme; (5) a selection map; (6) fusion by the proposed information theoretic approach.

Consequently, when the uncertainty is higher, the degree of difficulty in predicting the outcome of a draw from the probability distribution increases. A generalization of Shannon entropy is Rényi entropy [14], given by

$$R_\alpha(p) = \frac{1}{1-\alpha} \log \sum_{j=1}^{k} p_j^\alpha, \qquad \alpha \in (0,1) \cup (1,\infty). \qquad (16)$$

For $\alpha > 1$, the Rényi entropy is neither concave nor convex. For $\alpha \in (0,1)$, it is easy to see that the Rényi entropy is concave, and tends to the Shannon entropy $H(p)$ as $\alpha \to 1$. One may easily verify that $R_\alpha$ is a nonincreasing function of $\alpha$, and hence

$$R_\alpha(p) \ge H(p), \quad \forall \alpha \in (0,1). \qquad (17)$$

As $\alpha \to 0$, the Rényi entropy tends to the logarithm of the cardinality of the set $\{j \in [1,k] : p_j > 0\}$.

Figure 3 depicts the Rényi entropy of a Bernoulli distribution $p = (p, 1-p)$ for different values of the parameter $\alpha$. As illustrated in Fig. 3, the measure of uncertainty is at a minimum when Shannon entropy is used, and it increases as the parameter $\alpha$ decreases. Rényi entropy attains maximum uncertainty when its exponential order $\alpha$ is equal to zero. Note that smaller values of $\alpha$ tend to emphasize the probability tails.

Definition 2. Let $p_1, p_2, \ldots, p_n$ be $n$ probability distributions. The Jensen-Rényi divergence is defined as

$$JR_\alpha^\omega(p_1, \ldots, p_n) = R_\alpha\!\left(\sum_{i=1}^{n} \omega_i\, p_i\right) - \sum_{i=1}^{n} \omega_i\, R_\alpha(p_i),$$

where $R_\alpha(p)$ is Rényi's entropy, and $\omega = (\omega_1, \omega_2, \ldots, \omega_n)$ is a weight vector such that $\sum_{i=1}^{n} \omega_i = 1$ and $\omega_i \ge 0$.
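Equation (16) and Definition 2 translate directly into code. The short sketch below (function names are ours) checks the basic behavior: a positive value for distinct inputs and zero for identical ones.

```python
# Sketch of Renyi's entropy (Eq. (16)) and the Jensen-Renyi divergence
# (Definition 2). With alpha close to 1, the values approach the Shannon
# entropy and the Jensen-Shannon divergence respectively.
import numpy as np

def renyi(p, alpha):
    p = np.asarray(p, dtype=float)
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)

def jensen_renyi(dists, weights, alpha):
    dists = np.asarray(dists, dtype=float)
    w = np.asarray(weights, dtype=float)
    mixture = w @ dists                                 # sum_i w_i p_i
    return renyi(mixture, alpha) - sum(
        wi * renyi(p, alpha) for wi, p in zip(w, dists))

p, q = [0.9, 0.1], [0.5, 0.5]
print(jensen_renyi([p, q], [0.5, 0.5], alpha=0.7))   # > 0: p differs from q
print(jensen_renyi([p, p], [0.5, 0.5], alpha=0.7))   # = 0: identical inputs
```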

Fig. 7. Multi-modality image fusion: (1) a CT image; (2) an MRI image; (3) fusion by averaging; (4) fusion by wavelet based maximum selection scheme; (5) a selection map; (6) fusion by the proposed information theoretic approach.

Using the Jensen inequality, it is easy to check that the Jensen-Rényi divergence is nonnegative for $\alpha \in (0,1)$. It is also symmetric and, for all $\alpha > 0$, vanishes if and only if the probability distributions $p_1, p_2, \ldots, p_n$ are equal. Note that the Jensen-Shannon divergence [11] is a limiting case of the Jensen-Rényi divergence as $\alpha \to 1$.

Unlike other entropy-based divergence measures such as the Kullback-Leibler divergence, the Jensen-Rényi divergence has the advantage of being symmetric and generalizable to an arbitrary number of probability distributions or data sets, with the additional flexibility of assigning weights to these distributions (i.e. prior beliefs). Figure 4 shows three-dimensional representations and contour plots of the Jensen-Rényi divergence with equal weights between two Bernoulli distributions, for $\alpha \in (0,1)$ and for $\alpha \in (1,\infty)$. In addition to its convexity property, the Jensen-Rényi divergence has been shown to be a well-adapted measure of disparity among probability distributions, as recently demonstrated in registering Inverse Synthetic Aperture Radar (ISAR) images [6].

4.2. Image fusion with Jensen-Rényi divergence

As noted earlier, our primary goal in image fusion is to integrate complementary information from multi-sensor data, with the resulting fused images offering richer features and better human visual perception.

Let $f_1, f_2, \ldots, f_m : \mathbb{Z}^2 \to \mathbb{R}$ be $m$ digital images generated by different sensors. Our information-theoretic fusion approach first calculates a biorthogonal wavelet image representation for each $f_i$; then a pixel-level activity map $\{A_i(n)\}_{n \in \mathbb{Z}^2}$ is formed, as described in Section 3. Denote by $\|\cdot\|$ the $\ell^1$ norm, and for any $n \in \mathbb{Z}^2$ define a normalized activity pattern

$$p_i(n) = \begin{cases} A_i(n)/\|A_i(n)\| & \text{if } \|A_i(n)\| \neq 0 \\ \Delta_1 & \text{if } \|A_i(n)\| = 0, \end{cases}$$

Fig. 8. Multi-spectral image fusion: (1) a high resolution remote sensing image; (2) a low resolution remote sensing image; (3) fusion by averaging; (4) fusion by wavelet based maximum selection scheme; (5) a selection map; (6) fusion by the proposed information theoretic approach.

where $\Delta_1 = [1, 0, \ldots, 0]$ is a $(1 \times L)$ degenerate distribution. To fuse the source wavelet coefficients, we compare the normalized activity patterns of all the source images in terms of the Jensen-Rényi divergence, and create a selection map $\{S(n)\}_{n \in \mathbb{Z}^2}$:

$$S(n) = JR_\alpha^\omega\big(p_1(n), \ldots, p_m(n)\big). \qquad (18)$$

The selection map is further segmented into the two decision regions, $D_0$ and $D_1$. Setting $T$ to be the mean value of the selection map, we write

$$D_0 = \{n \in \mathbb{Z}^2 : S(\lfloor 2^{-L} n \rfloor) < T\}$$

and

$$D_1 = \{n \in \mathbb{Z}^2 : S(\lfloor 2^{-L} n \rfloor) \ge T\},$$

where $\lfloor x \rfloor$ denotes the integer part of $x$.

Let $f$ be a composite image with wavelet coefficients

$$W f = \{d^1_f(j,n), d^2_f(j,n), d^3_f(j,n), a_f(L,n)\}_{0 < j \le L,\, n \in \mathbb{Z}^2}$$

defined as

$$a_f(L,n) = \begin{cases} (1/m) \sum_{i=1}^{m} a_i(L,n) & \text{if } 2^L n \in D_0 \\ \max_i(a_i(L,n)) & \text{if } 2^L n \in D_1 \end{cases}$$

and, for $k = 1, 2, 3$,

$$d^k_f(j,n) = \begin{cases} (1/m) \sum_{i=1}^{m} d^k_i(j,n) & \text{if } 2^j n \in D_0 \\ \max_i(d^k_i(j,n)) & \text{if } 2^j n \in D_1. \end{cases}$$

It can be verified that the fused image $f$ is the solution to the optimization criterion of Eq. (15); a code sketch of this rule is given below.

In what follows, we present four examples, including multi-sensor navigation image fusion, multi-focus optical image fusion, multi-modality medical image fusion and multi-spectral remote sensing image fusion, to illustrate the fusion scheme proposed above.
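Putting the pieces together, the following sketch implements the rule above end to end, reusing `activity_map` from the Definition 1 sketch and assuming PyWavelets throughout (the function names are ours). Intuitively, averaging on $D_0$ minimizes the first term of Eq. (15), while the selection on $D_1$ pushes coefficient energy toward the constraint boundary; the literal `max` of the rule above is kept, though a magnitude-based selection is a common variant.

```python
# End-to-end sketch of Section 4.2 (assumes PyWavelets and the
# activity_map sketch above): normalized activity patterns, the
# Jensen-Renyi selection map of Eq. (18) with equal weights, a mean
# threshold T splitting D0/D1, then average-on-D0 / max-on-D1 fusion.
import numpy as np
import pywt

def renyi_map(p, alpha):
    # Eq. (16) along the last axis of an array of distributions.
    return np.log(np.sum(p ** alpha, axis=-1)) / (1.0 - alpha)

def jr_fuse(sources, wavelet="bior2.2", L=3, alpha=0.7):
    mode = "periodization"                   # keeps every grid exactly dyadic
    decomps = [pywt.wavedec2(f, wavelet, level=L, mode=mode) for f in sources]
    acts = np.stack([activity_map(f, wavelet, L) for f in sources])
    tot = acts.sum(axis=-1, keepdims=True)
    pats = np.divide(acts, tot, out=np.zeros_like(acts), where=tot > 0)
    delta1 = np.eye(acts.shape[-1])[0]       # Delta_1 = [1, 0, ..., 0]
    pats = np.where(tot > 0, pats, delta1)
    # Eq. (18) with equal weights, evaluated at every coarse-grid position.
    S = renyi_map(pats.mean(axis=0), alpha) - renyi_map(pats, alpha).mean(axis=0)
    D1 = S >= S.mean()                       # T = mean of the selection map
    A = np.stack([d[0] for d in decomps])    # approximation coefficients
    fused = [np.where(D1, A.max(axis=0), A.mean(axis=0))]
    for j, levels in enumerate(zip(*[d[1:] for d in decomps])):  # coarse->fine
        mask = np.repeat(np.repeat(D1, 2 ** j, axis=0), 2 ** j, axis=1)
        fused.append(tuple(
            np.where(mask, np.stack(b).max(axis=0), np.stack(b).mean(axis=0))
            for b in zip(*levels)))
    return pywt.waverec2(fused, wavelet, mode=mode)
```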

4.3. Multi-sensor image fusion

To help helicopter pilots navigate under poor visibility conditions, such as fog or heavy rain, helicopters are equipped with several imaging sensors, which are accessible to the pilot through a helmet-mounted display. A typical sensor suite includes both a low-light-television (LLTV) sensor and a thermal imaging forward-looking-infrared (FLIR) sensor. In the current configuration, the pilot can only choose one of the two sensors to watch on his display. Sample LLTV and FLIR images are shown in Figs 5(1) and 5(2) respectively. A possible improvement is to combine both imaging sources into a single fused image.

Image fusion by standard techniques such as pixel averaging and the multiscale-based maximum selection scheme is shown in Figs 5(3) and 5(4) respectively. Note that the pixel averaging method has a "muddy" appearance. This is due primarily to the fact that averaging results in reduced contrast for all the patterns that appear in only one source. On the other hand, the maximum selection scheme produces some mosaic-like artifacts due to the high frequency noise introduced by sudden switches between the two sets of source wavelet coefficients. Image fusion with our proposed multiscale information-theoretic approach is illustrated in Fig. 5(6). As may be seen, all the significant features from both sources are retained in the fused image without the artifacts of the previous approaches.

4.4. Multi-focus image fusion

Due to the limited depth-of-focus of optical lenses, it is often not possible to get an image which contains all relevant objects "in focus". One possibility to overcome this problem is to take several pictures with different focus points and to combine them into a single frame which finally includes the focused regions of all input images. Figure 6 illustrates our multiscale information-theoretic fusion approach. For comparison purposes, fusion by pixel averaging and the multiscale-based maximum selection scheme are shown in Figs 6(3) and 6(4).

4.5. Multi-modality image fusion

With the development of new imaging methods and increasing challenges in medical applications arises the demand for a meaningful and spatially correct combination of all available image datasets. Examples of imaging devices include computed tomography (CT), magnetic resonance imaging (MRI) and the newer positron emission tomography (PET). Our multiscale information-theoretic approach is illustrated in Fig. 7(6). For comparison purposes, fusion by pixel averaging and the multiscale-based maximum selection scheme are shown in Figs 7(3) and 7(4).

4.6. Multi-spectral image fusion

Image fusion is often of interest in remote sensing applications: modern spectral sensors include up to several hundred spectral bands, which may be either processed and visualized individually, or fused into a single image, depending on the image analysis task. Image fusion of two bands from a multispectral sensor with our multiscale information-theoretic approach is illustrated in Fig. 8. For comparison purposes, fusion by pixel averaging and the multiscale-based maximum selection scheme are shown in Figs 8(3) and 8(4).

5. Conclusion

In this paper, we proposed a new multiscale image fusion algorithm aimed at integrating complementary information from multi-sensor data so that the fused images are enhanced and more friendly to human perception. We formulated image fusion as an optimization problem whose solution is achieved by the proposed method.

As a first step, a biorthogonal wavelet transform of each source image is calculated to yield a scale-space representation. Compared to most orthogonal wavelets, biorthogonal wavelets may be synthesized by perfect reconstruction filters with a linear phase. This is a desirable property for image fusion applications. Unlike the "choose max" type of selection rules, our proposed technique is based on an information-theoretic fusion measure. Since all the scales, from fine to coarse, are considered in assessing the activity at a particular position within an image, our approach is more accurate, as the selected source coefficients yield richer information.

We have successfully tested the new technique on fusion of multi-sensor (low-light-television and forward-looking-infrared), multi-focus, multi-modality (CT and MRI), and multi-spectral images. The presented algorithm clearly outperforms current wavelet-based fusion methods in preserving significant features from a variety of sources, without common shortcomings such as artifacts.

References

[1] P. Ajjimarangsee and T.L. Huntsberger, Neural network model for fusion of visible and infrared sensor outputs, Proc. SPIE, 1988, pp. 153–160.

[2] P.J. Burt, The Pyramid as a Structure for Efficient Computation. Multiresolution image processing and analysis, A. Rosenfeld, ed., Springer-Verlag, New York, 1984.
[3] P.J. Burt, A gradient pyramid basis for pattern selective fusion, Proc. of the Society for Information Display Conference, 1992.
[4] P.J. Burt and R.J. Kolczynski, Enhanced image capture through fusion, Proc. 4th Intl. Conference on Computer Vision, 1993, pp. 173–182.
[5] S. Chen, Stochastic image algebra for multi-sensor fusion and spatial reasoning: a neural approach, Proc. SPIE, 1989, pp. 146–154.
[6] Y. He, A. Ben Hamza and H. Krim, A generalized divergence measure for robust image registration, IEEE Trans. on Signal Processing (51) (2003), 1211–1220.
[7] T.L. Huntsberger, Data fusion: A neural networks implementation. Data fusion in robotics and machine intelligence, M. Abidi and R. Gonzalez, eds, Academic Press, San Diego, 1992.
[8] H. Krim, W. Willinger, A. Juditski and D. Tse, Special issue on multiscale statistical signal analysis and its applications, IEEE Trans. Infor. Theory (45) (1999), 825–1062.
[9] J.S. Lee, Multiple sensor fusion based on morphological processing, Proc. SPIE, 1988, pp. 94–100.
[10] H. Li, B.S. Manjunath and S.K. Mitra, Multisensor image fusion using the wavelet transform, Graphical Models and Image Processing (57) (1995), 235–245.
[11] J. Lin, Divergence measures based on the Shannon entropy, IEEE Trans. Information Theory (37) (1991), 145–151.
[12] R. Luo and M. Kay, Data fusion and sensor integration: state of the art in 1990s. Data fusion in robotics and machine intelligence, M. Abidi and R. Gonzalez, eds, Academic Press, San Diego, 1992.
[13] S. Mallat, A Wavelet Tour of Signal Processing, Academic Press, London, UK, 1997.
[14] A. Rényi, On measures of entropy and information, Selected Papers of Alfréd Rényi, 1976, pp. 525–580.
[15] G.X. Ritter, J.N. Wilson and J.L. Davidson, Image algebra application to multi-sensor and multi-data image manipulation, Proc. SPIE, 1988, pp. 2–7.
[16] O. Rockinger, Pixel level fusion of image sequences using wavelet frames, Proc. 16th Leeds Annual Statistical Research Workshop, 1996, pp. 149–154.
[17] A.S. Willsky, Multiresolution Markov models for signal and image processing, Proceedings of the IEEE (90) (2002), 1396–1458.
[18] W.A. Wright, A Markov random field approach to data fusion and color segmentation, Image Vision Comp (7) (1989), 144–150.
[19] D.A. Yocky, Artifacts in wavelet image merging, Optical Eng (35) (1996), 2094–2101.
