Optik 124 (2013) 40–51
Multi-focus image fusion based on nonsubsampled contourlet transform and focused regions detection

Huafeng Li b,∗, Yi Chai a,b, Zhaofei Li a

a State Key Laboratory of Power Transmission Equipment and System Security and New Technology, Chongqing University, Chongqing 400044, PR China
b College of Automation, Chongqing University, Chongqing 400044, PR China
∗ Corresponding author. E-mail address: lhfaiww@126.com (H. Li).

Article history: Received 15 June 2011; Accepted 5 November 2011

Keywords: Image fusion; Nonsubsampled contourlet transform (NSCT); Morphological opening and closing; Fusion rules

Abstract

A novel multi-focus image fusion algorithm is proposed in the nonsubsampled contourlet transform (NSCT) domain. It uses focused regions detection and several new fusion rules to guide the combination of coefficients. First, an initial fused image is acquired with a conventional multiresolution image fusion method; meanwhile, the source multi-focus images are decomposed with the NSCT. Pixels of the original images that are more similar to the corresponding pixels of the initial fused image are considered to be from the sharply focused regions. Based on this, the initial focused regions are determined, and morphological opening and closing are employed for postprocessing. The focused regions identified in each image are then used to guide the combination of the different subband coefficients. Moreover, to avoid the erroneous results introduced at the boundary of the focused regions, the coefficients located at the focused border regions are fused by choosing the corresponding subband coefficients according to new fusion rules. Finally, the fused image is reconstructed by performing the inverse NSCT. Experimental results show that the proposed fusion approach is effective and provides better performance in fusing multi-focus images than some current methods.

© 2011 Elsevier GmbH. All rights reserved.

1. Introduction

Image fusion can be defined as the process by which a set of images is combined to produce a new image that integrates complementary, multi-temporal or multi-view information from the sources [1]. The resultant image is more informative and better suited to human visual perception and to computer-processing tasks such as segmentation, feature extraction and target recognition. Image fusion not only provides a better interpretation of the scene, but also reduces data storage and improves the reliability of the image content. Important applications of image fusion include biomedical imaging, marine sciences, microscopic imaging, remote sensing, computer vision, and robotics.

Taking account of the objectives and benefits of image fusion, the following requirements must be imposed [2,3]:

1. The image fusion process should retain all relevant and salient information of the source images while discarding irrelevant parts in the fused result.
2. The image fusion process should not yield any artificial information that would distract the human observer or subsequent image processing tasks.
3. The image fusion process should be capable of avoiding imperfections such as mis-registration.

In this paper, we concentrate on multi-focus image fusion. Due to the limited depth-of-focus of optical lenses in CCD devices, it is often not possible to get an image that contains all relevant objects in focus. To obtain an image with every object in focus, a fusion process is required to combine images taken from the same viewpoint under different focal settings.

Basically, there are two main approaches to multi-focus image fusion. One is the spatial domain-based methods, which select pixels or regions from the clear parts of the sources to compose the fused image [4–9]. The other is the transform domain-based methods, which fuse images with a certain frequency or time–frequency transform [10–17].

The simplest fusion method in the spatial domain is to take the average of the source images pixel by pixel. However, along with simplicity come several undesired side effects, including reduced contrast. To improve the quality of the fused image, more reasonable methods were proposed that fuse the source images by divided blocks or segmented regions instead of single pixels [4,5,7,9]. Most of these methods combine the blocks or regions according
to a measurement that evaluates whether each part is clear. Another important spatial domain-based approach is to identify the focused regions in each image [6,8]; these regions are then fused into a single image by simply copying them into the resultant image. However, the fused images of the block-based methods often suffer from block effects, and the focused-regions-based methods may easily produce artificial information or erroneous results at the focused border regions, because the boundary of the focused regions often cannot be accurately determined. All of these effects degrade the appearance of the fused image considerably.

In recent years, a more popular approach has been to use multiscale transforms (MST). There is evidence that the human visual system (HVS) performs a similar signal decomposition in its early processing [18]. Commonly used multiscale transforms include the Laplacian pyramid [19], curvelet [17], contourlet [20] and nonsubsampled contourlet transform [21]. Relative to the block-based and focused-regions-based fusion methods, the MST-based methods can successfully overcome the disadvantages mentioned above, because coefficients in subbands, not pixels or blocks in the spatial domain, are considered as image details and selected to compose the fused image. That is why many researchers prefer MST for image fusion. One of the best-known MST methods for image fusion is the wavelet transform.

However, the wavelet has serious limitations in dealing with high-dimensional signals like images, even though it can be seen as an optimal tool for analyzing one-dimensional (1-D) piecewise smooth signals. As a tensor product of 1-D wavelets, the two-dimensional (2-D) separable wavelet is good at isolating discontinuities at object edges, but cannot effectively represent 'line' and 'curve' discontinuities. Moreover, the 2-D separable wavelet decomposes an image into only three directional highpass subbands, namely vertical, horizontal and diagonal, and thus cannot represent the directions of edges accurately. A wavelet-based fusion scheme therefore cannot preserve the salient features of the source images very well and will probably introduce artifacts and inconsistency into the fused image.

Compared with the traditional wavelet transform, several multiscale geometric analysis (MGA) tools, such as the ridgelet [22], curvelet [17] and contourlet transform (CT) [20], can take full advantage of the geometric regularity of the intrinsic image structures and obtain an asymptotically optimal representation. As an MGA tool, the CT is a 'true' 2-D sparse representation for 2-D signals like images. It has the characteristics of localization, multidirectionality, and anisotropy. The CT gives an asymptotically optimal representation of contours and has been successfully used in the image fusion field [23]. However, the CT lacks shift-invariance and produces pseudo-Gibbs phenomena along edges to some extent. Relative to the CT, the nonsubsampled contourlet transform (NSCT) [21] inherits its desirable properties and, in addition, possesses shift-invariance. When the NSCT is introduced into the image fusion field, more information for fusion can be obtained, and the impacts of mis-registration on the fused results can also be reduced effectively [16]. The NSCT is therefore better suited to image fusion.

For image fusion algorithms in the MST domain, one of the most important factors in improving fusion quality is the selection of fusion rules, which influences the performance of the fusion algorithm remarkably. For multi-focus image fusion in particular, the key point is establishing a good measurement to determine which coefficient is in focus. However, it is often not possible to design a measurement that always successfully distinguishes the clear coefficients from the blurry parts. One may use a better focus measurement and consistency verification as a remedy to improve the performance, and more coefficients are then selected from the focused regions, but still not completely.

To remedy this disadvantage of the traditional MST-based fusion methods, which cannot select the coefficients from the focused regions completely, and to improve the fusion performance, an effective multi-focus image fusion algorithm is proposed based on the NSCT and a focused regions detection technique. First, the initial fused image is acquired with a conventional multiresolution image fusion method. Pixels of the original images that are similar to the corresponding pixels of the initial fused image are considered to be from the sharply focused regions. Based on this, the initial focused regions are determined, and morphological opening and closing are employed for postprocessing. The focused regions identified in each image are then used to guide the fusion process in the NSCT domain. Moreover, to avoid introducing artifacts or erroneous results, the focused border regions are fused by choosing the corresponding subband coefficients according to new fusion rules. Finally, the fused image is reconstructed by performing the inverse NSCT. Visual and quantitative analysis of the different fusion results proves that the proposed method improves the fusion quality and outperforms some existing methods.

The remaining sections of this paper are organized as follows: Section 2 briefly reviews the theory of the NSCT. Section 3 presents the method of focused regions detection. Section 4 describes the image fusion algorithm using the NSCT. Experimental results and discussion, including performance analysis, are given in Section 5. Finally, conclusions are drawn in Section 6.

2. Nonsubsampled contourlet transform

In this section, we briefly review the theory of the NSCT, which is used to decompose the source multi-focus images in this paper.

The NSCT is a shift-invariant version of the CT. In the CT, the Laplacian pyramid (LP) and the directional filter bank (DFB) are employed for multiscale decomposition and directional decomposition, respectively. To get rid of the frequency aliasing of the CT and to achieve shift-invariance, da Cunha, Zhou, and Do proposed the NSCT based on a nonsubsampled pyramid decomposition and nonsubsampled filter banks (NSFB). Fig. 1 shows the decomposition framework of the NSCT.

In the NSCT, the multiresolution decomposition is realized by shift-invariant filter banks, which achieve a subband decomposition structure similar to that of the LP. These filter banks can be built from two-channel nonsubsampled 2-D filter banks. Fig. 2 illustrates the nonsubsampled pyramid decomposition with J = 3 stages. Such an expansion is conceptually similar to the 1-D nonsubsampled wavelet transform (NSWT) computed with the à trous algorithm [24].

The nonsubsampled DFB (NSDFB) [25], a shift-invariant version of the critically sampled DFB in the CT, is employed by the NSCT. The NSDFB can be constructed by eliminating the downsamplers and upsamplers in the DFB. To achieve multi-directional decomposition, the NSDFB is applied iteratively. The NSCT therefore has the shift-invariance property because of the nonsubsampled operation.

As can be seen from the above discussion, the NSCT not only retains the characteristics of the contourlet, but also possesses the important additional property of shift-invariance. When it is introduced into image fusion, the sizes of the different subbands are identical, so it is easy to find the relationships among the subbands, which is beneficial for designing fusion rules, and the impacts of mis-registration on the fused results can also be reduced effectively [16]. The NSCT is therefore well suited to image fusion. A minimal code sketch of the nonsubsampled multiscale expansion is given after Fig. 2.

Fig. 1. Nonsubsampled contourlet transform. (a) NSFB structure that implements the NSCT. (b) Corresponding frequency partitioning.

Fig. 2. The nonsubsampled pyramid is a 2-D multiresolution expansion similar to the 1-D NSWT. (a) Three-stage pyramid decomposition. The lighter gray regions denote
the aliasing caused by upsampling. (b) Subbands on the 2-D frequency plane.
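The expansion in Fig. 2 is straightforward to prototype. Below is a minimal sketch of an à trous-style nonsubsampled pyramid, assuming NumPy/SciPy and a B3-spline lowpass kernel (a standard choice for the à trous algorithm); the directional NSDFB stage of the full NSCT is omitted here, so this is only the multiscale part, not the authors' NSCT implementation.

```python
import numpy as np
from scipy import ndimage

def atrous_pyramid(img, levels=3):
    """Nonsubsampled pyramid: filter with upsampled ('holed') kernels
    instead of downsampling, so every subband keeps the input size."""
    h = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0    # B3-spline lowpass
    lowpass, details = img.astype(float), []
    for j in range(levels):
        hj = np.zeros(4 * 2 ** j + 1)                 # insert 2**j - 1 zeros
        hj[:: 2 ** j] = h                             # between the taps
        smooth = ndimage.convolve1d(lowpass, hj, axis=0, mode='mirror')
        smooth = ndimage.convolve1d(smooth, hj, axis=1, mode='mirror')
        details.append(lowpass - smooth)              # full-size bandpass plane
        lowpass = smooth
    return lowpass, details
```

Because no subband is decimated, corresponding coefficients across scales stay spatially aligned, which is exactly the property the fusion rules in Section 4 rely on.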

3. Detection of focused regions

3.1. Acquisition of the initial fused image

The lifting stationary wavelet transform (LSWT) [13] is similar to the lifting wavelet transform (LWT). It is a redundant scheme, as each set of coefficients contains the same number of samples as the input, and it provides the shift-invariance that the traditional lifting wavelet transform lacks. Compared with the classical wavelet transform (WT), the LSWT possesses several advantages, including the possibility of adaptive and nonlinear design, in-place calculation, irregular samples and integer transforms [26–28]. It can be seen as an alternative implementation of the classical wavelet transform. The main feature of the lifting scheme is that it provides an entirely spatial-domain interpretation of the transform, as opposed to the traditional frequency-domain-based constructions [27]. We therefore use the LSWT as the conventional multiresolution method to obtain the initial fused image. For the example of fusing two images A and B, an overview schematic diagram of the initial fusion method is shown in Fig. 3. The input images must be registered as a prerequisite, so that the corresponding pixels are aligned. The fusion process is accomplished by the following steps (a minimal sketch is given after Fig. 3):

Step 1: Decompose the source images A and B, respectively, into lowpass subband coefficients and a series of highpass subband coefficients by the LSWT at resolution level 4.
Step 2: Merge the lowpass subband coefficients and the highpass subband coefficients by the 'averaging' scheme and the 'absolute maximum choosing' scheme, respectively.
Step 3: Apply the inverse LSWT to the fused subbands to get the initial fused image F.

Fig. 3. Fusion scheme using the LSWT.
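As a rough illustration of Fig. 3, the sketch below performs the three steps with PyWavelets' undecimated SWT standing in for the LSWT; both are redundant and shift-invariant, but this is not the authors' lifting implementation, and `pywt.swt2` additionally requires the image sides to be divisible by 2**level.

```python
import numpy as np
import pywt

def initial_fusion(img_a, img_b, wavelet="db1", level=4):
    """Steps 1-3: decompose, merge (average lowpass / abs-max highpass),
    then invert to obtain the initial fused image F."""
    ca = pywt.swt2(img_a.astype(float), wavelet, level=level)
    cb = pywt.swt2(img_b.astype(float), wavelet, level=level)
    fused = []
    for (a_lo, a_hi), (b_lo, b_hi) in zip(ca, cb):
        lo = (a_lo + b_lo) / 2.0                           # 'averaging' scheme
        hi = tuple(np.where(np.abs(ha) >= np.abs(hb), ha, hb)
                   for ha, hb in zip(a_hi, b_hi))          # 'abs-max choosing'
        fused.append((lo, hi))
    return pywt.iswt2(fused, wavelet)
```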
3.2. Identification of the focused regions

For multi-focus images, pixels with greater similarity to the corresponding pixels of the initial fused image F can be considered to be from the focused regions. According to this property, we can identify the focused regions by using the root mean square error (RMSE) to measure the similarity between the source images (A and B) and the initial fused image F. The detection of focused regions proceeds by the following steps [29], sketched in code after this subsection:

(1) Calculate the RMSE within a (2M + 1) × (2N + 1) window centered at each pixel between each source image (A and B) and the initial fused image F, using Eqs. (1) and (2), respectively:

\[
\mathrm{RMSE}_A(i,j) = \left[\frac{\sum_{m=-M}^{M}\sum_{n=-N}^{N}\bigl(I_F(i+m,j+n)-I_A(i+m,j+n)\bigr)^2}{(2M+1)\times(2N+1)}\right]^{1/2} \tag{1}
\]

\[
\mathrm{RMSE}_B(i,j) = \left[\frac{\sum_{m=-M}^{M}\sum_{n=-N}^{N}\bigl(I_F(i+m,j+n)-I_B(i+m,j+n)\bigr)^2}{(2M+1)\times(2N+1)}\right]^{1/2} \tag{2}
\]

where I_A(i, j), I_B(i, j) and I_F(i, j) are the pixel values at the (i, j) coordinates of the source images A, B and the initial fused image F, respectively; (2M + 1) × (2N + 1) is a 5 × 5 window size, determined according to the experimental results.

(2) Compare the values RMSE_A(i, j) and RMSE_B(i, j) to determine which pixel is in focus. The logical matrix Z (essentially a binary image) is constructed as

\[
Z(i,j) = \begin{cases} 1 & \text{if } \mathrm{RMSE}_A(i,j) \le \mathrm{RMSE}_B(i,j) \\ 0 & \text{if } \mathrm{RMSE}_A(i,j) > \mathrm{RMSE}_B(i,j) \end{cases} \tag{3}
\]

where '1' in Z indicates that the pixel at position (i, j) in image A is in focus; otherwise the pixel in B is in focus.

(3) Determination by RMSE alone is insufficient to discern all the focused pixels: there are thin protrusions, thin gulfs, narrow breaks, small holes, etc. in Z. Based on imaging theory, both the in-focus and out-of-focus regions are continuous in their interiors, so these defects should be removed for good fusion quality. To correct them, morphological opening and closing with a small structuring element are employed. Opening, denoted Z ∘ B′, is simply erosion of Z by the structuring element B′ followed by dilation of the result by B′; it removes thin connections and thin protrusions. Closing, dilation followed by erosion and denoted Z • B′, joins narrow breaks and fills long thin gulfs. Unfortunately, holes larger than B′ cannot be removed using the opening and closing operators alone, and in practice small holes are often judged incorrectly; therefore, a threshold TH is set to remove the holes smaller than the threshold [30]. Opening and closing are then performed again to smooth the object contours.

For the proposed focused regions detection, the structuring element B′ is a 4 × 4 matrix of logical 1's, and the threshold TH is set according to the experimental results. The fused image FF can now be constructed simply by the following equation:

\[
I_{FF}(i,j) = \begin{cases} I_A(i,j) & \text{if } Z'(i,j) = 1 \\ I_B(i,j) & \text{if } Z'(i,j) = 0 \end{cases} \tag{4}
\]

where Z′ is the modified Z matrix of step (3), and I_FF(i, j) is the pixel value at the (i, j) coordinate of the fused image FF. For simplicity, we name this fusion method the Foc-Det method in this paper.

In the remainder of this section, a set of multi-focus images with perfect registration is used to test the performance of the above fusion method.

Fig. 4. Fusion example. (a) Focus on the right; (b) focus on the left; (c) Z matrix of step (2) in Section 3.2; (d) detected focused region (modified Z matrix of step (3) in Section 3.2); (e) fusion result using the Foc-Det method.

Fig. 4 illustrates the source images and the fusion result obtained by the above method. Fig. 4(c) is the Z matrix of step (2), and Fig. 4(d) is the detected focused region obtained by modifying the Z matrix. Fig. 4(e) is the fusion result. Focusing on the labeled regions in Fig. 4(e), one can clearly see that the fused image contains many erroneous results at the boundary of the focused regions. These effects significantly compromise the quality of the fused image, which is vital for human and machine perception. Though one may use consistency verification as a remedy, these effects can only be suppressed to a certain degree, not removed completely. That is the reason why we do not use the above method directly to fuse the multi-focus images. Compared with Foc-Det, our proposed method successfully overcomes these disadvantages by selecting the fusion coefficients of the focused border regions according to new selection principles in the NSCT domain.
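A compact sketch of this detection step, assuming float grayscale arrays of equal size: `uniform_filter` realizes the 5 × 5 windowed means of Eqs. (1) and (2), and scikit-image's `remove_small_holes` stands in for the TH-threshold hole removal of [30] (the value of `th` below is a placeholder, since the paper sets TH experimentally).

```python
import numpy as np
from scipy.ndimage import uniform_filter, binary_opening, binary_closing
from skimage.morphology import remove_small_holes

def detect_focused_regions(img_a, img_b, fused, win=5, th=1000):
    """Return the modified matrix Z' of Eqs. (1)-(3) plus morphology."""
    a, b, f = (x.astype(float) for x in (img_a, img_b, fused))
    rmse_a = np.sqrt(uniform_filter((f - a) ** 2, size=win))    # Eq. (1)
    rmse_b = np.sqrt(uniform_filter((f - b) ** 2, size=win))    # Eq. (2)
    z = rmse_a <= rmse_b                                        # Eq. (3)
    se = np.ones((4, 4), bool)               # 4 x 4 structuring element B'
    z = binary_closing(binary_opening(z, se), se)
    z = remove_small_holes(z, area_threshold=th)        # holes in A-regions
    z = ~remove_small_holes(~z, area_threshold=th)      # holes in B-regions
    return binary_closing(binary_opening(z, se), se)    # smooth the contours

# Eq. (4): the Foc-Det fused image then copies focused pixels directly, e.g.
# fused_ff = np.where(detect_focused_regions(a, b, f), a, b)
```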
4. The proposed fusion algorithm

In order to improve the quality of the fused image, two different focus measurements are presented and used to fuse the coefficients located at the focused border regions, while the modified Z matrix is employed to guide the combination of the coefficients within the focused regions.

4.1. The fusion rule for the lowpass subband coefficients

The coefficients in the coarsest-scale subband represent the approximation component of the source image. The simplest way to produce the composite coefficients is the conventional averaging method. However, this greatly reduces the contrast of the fused image and loses some useful information of the source images. To improve the fused image quality, a clarity measure should be defined to determine whether a coefficient of the lowpass subband is in focus or out of focus.

In this paper, we introduce the concept of image visibility (VI), which is inspired by the human visual system and defined as follows for an M1 × N1 image I [5,29,31,32]:

\[
\mathrm{VI} = \frac{1}{M_1 \times N_1}\sum_{i=1}^{M_1}\sum_{j=1}^{N_1}\left(\frac{1}{m_k}\right)^{\alpha}\cdot\frac{|I(i,j)-m_k|}{m_k} \tag{5}
\]

where m_k is the mean intensity value of the image, α is a visual constant ranging from 0.6 to 0.7, and I(i, j) denotes the gray value of the pixel at position (i, j).

We should mention that extensive experiments show VI to be more significant for multi-focus image fusion than for different-sensor image fusion, and it has been successfully used in multi-focus image fusion [29,31]. In order to represent local clarity, we introduce the concept of local visibility in the NSCT domain and use it to fuse the lowpass subband coefficients located at the focused border regions.

When Ī(i, j) ≠ 0, the local neighborhood visibility (LV) is defined as

\[
\mathrm{LV}(i,j) = \frac{1}{(2P+1)(2Q+1)}\sum_{p=-P}^{P}\sum_{q=-Q}^{Q}\left(\frac{1}{\bar{I}(i,j)}\right)^{\alpha}\cdot\frac{|I(i+p,j+q)-\bar{I}(i,j)|}{\bar{I}(i,j)} \tag{6}
\]

where

\[
\bar{I}(i,j) = \frac{1}{(2P+1)(2Q+1)}\sum_{p=-P}^{P}\sum_{q=-Q}^{Q}I(i+p,j+q) \tag{7}
\]

and (2P + 1) × (2Q + 1) is the local area size. When Ī(i, j) = 0, LV(i, j) = I(i, j).

Let I_L^A(i, j), I_L^B(i, j) and I_L^F(i, j) denote the lowpass subband coefficients of source image A, source image B and the fused image F, respectively, at the Lth scale and location (i, j). In this paper, we use a slipping window over the modified matrix Z′ to detect the focused border regions. The selection principle for the lowpass subband coefficients is finally defined as

\[
I_L^F(i,j) = \begin{cases} I_L^A(i,j) & \text{if } Z'(i,j)=1 \text{ and } z(i,j)=m_1 n_1 \\ I_L^B(i,j) & \text{if } Z'(i,j)=0 \text{ and } z(i,j)=0 \\ I_L^A(i,j) & \text{if } 0<z(i,j)<m_1 n_1 \text{ and } \mathrm{LV}_L^A(i,j)\ge \mathrm{LV}_L^B(i,j) \\ I_L^B(i,j) & \text{if } 0<z(i,j)<m_1 n_1 \text{ and } \mathrm{LV}_L^A(i,j)< \mathrm{LV}_L^B(i,j) \end{cases} \tag{8}
\]

where

\[
z(i,j) = \sum_{k_1=-(m_1-1)/2}^{(m_1-1)/2}\;\sum_{k_2=-(n_1-1)/2}^{(n_1-1)/2} Z'(i+k_1,\,j+k_2) \tag{9}
\]

and m1 × n1 is the slipping window size. z(i, j) = m1·n1 means the coefficient located at (i, j) in image A lies within the focused regions and can be selected as the coefficient of the fused image directly, whereas z(i, j) = 0 indicates that the corresponding coefficient from image B lies within the focused regions and can be selected to compose the coefficient of the fused image. In addition, 0 < z(i, j) < m1·n1 means the coefficient located at (i, j) is in the focused border regions and is fused according to the local visibility (see the sketch below). To maximize the quality of the fused image, the differential evolution algorithm, which is used to optimize the block size in [9], can likewise be used to optimize m1 and n1.
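Assuming the modified map Z′ from Section 3.2 and two NSCT lowpass subbands of the same size, Eqs. (6)–(9) can be sketched as follows. One simplification to note: the deviation term uses the per-pixel window mean rather than a mean fixed at the window center, and α = 0.65 is simply taken from the stated 0.6–0.7 range.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_visibility(sub, size=3, alpha=0.65):
    """Eqs. (6)-(7): local mean absolute deviation scaled by the local mean."""
    mean = uniform_filter(sub, size=size)                   # Eq. (7)
    dev = uniform_filter(np.abs(sub - mean), size=size)
    safe = np.where(mean == 0.0, 1.0, np.abs(mean))         # divide-by-zero guard
    lv = dev / safe ** (1.0 + alpha)
    return np.where(mean == 0.0, sub, lv)                   # LV = I where mean = 0

def fuse_lowpass(low_a, low_b, z_mod, m1=3, n1=3):
    """Eq. (8): copy focused-region coefficients; let border coefficients
    compete on local visibility. z counts focused-A pixels per window."""
    z = uniform_filter(z_mod.astype(float), size=(m1, n1)) * (m1 * n1)  # Eq. (9)
    border = (z > 0) & (z < m1 * n1)
    take_a = np.isclose(z, m1 * n1) | (
        border & (local_visibility(low_a) >= local_visibility(low_b)))
    return np.where(take_a, low_a, low_b)
```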

4.2. The fusion rule for the highpass subband coefficients

The coefficients in the highpass subbands represent the detail component of the source image. For the highpass subband coefficients, the most commonly used selection principle is the 'absolute-maximum-choosing' scheme (abbreviated 'Coef-abs-max'), which takes no account of the lowpass coefficients; that is, all the information in the lowpass subband is neglected.

According to physiological and psychological research, the HVS is highly sensitive to the local image contrast level. To meet this requirement, Toet and van Ruyven developed the local luminance contrast in their research on the contrast pyramid [33]. It is defined as

\[
C = \frac{L - L_b}{L_b} = \frac{\Delta L}{L_b} \tag{10}
\]

where L denotes the local gray level and L_b is the local brightness of the background, which corresponds to the low frequency component. ΔL can therefore be taken as the high frequency component.

Many different forms of contrast measurement have since been proposed for multimodal image fusion based on this idea, and they provide better performance than the 'Coef-abs-max' scheme [16,23,34]. However, in those contrast measurements, the value (or the absolute value) of a single pixel in the MST domain is used as the high frequency component. In fact, the value (or the absolute value) of a single pixel is of very limited use in determining which pixel comes from the clear parts of the subimages. For multi-focus image fusion, pixels with high variation are considered to carry more detail information and are therefore more likely to be in focus. We thus believe it is more reasonable to employ features that reflect the variation of the highpass subband pixels, rather than the value (or the absolute value) of a single pixel, as the high frequency component of the contrast measurement.

The spatial frequency [35,36], which originated from the HVS, indicates the overall activity level in an image and measures the variation of its pixels. It can effectively represent the focusing properties of multi-focus images, and its use has led to an effective objective quality index for image fusion [4].

For an M1 × N1 image I, with the gray value at pixel position (i, j) denoted by I(i, j), the spatial frequency is defined as

\[
\mathrm{SF} = \sqrt{\mathrm{RF}^2 + \mathrm{CF}^2} \tag{11}
\]

where RF and CF are the row frequency

\[
\mathrm{RF} = \sqrt{\frac{1}{M_1 N_1}\sum_{i=1}^{M_1}\sum_{j=2}^{N_1}\bigl(I(i,j)-I(i,j-1)\bigr)^2} \tag{12}
\]

and the column frequency

\[
\mathrm{CF} = \sqrt{\frac{1}{M_1 N_1}\sum_{i=2}^{M_1}\sum_{j=1}^{N_1}\bigl(I(i,j)-I(i-1,j)\bigr)^2} \tag{13}
\]

respectively.

In order to represent the local area activity level, we introduce the concept of local area spatial frequency (LAF) in the NSCT domain, given by

\[
\mathrm{LAF}(i,j) = \sqrt{\mathrm{LRF}^2(i,j) + \mathrm{LCF}^2(i,j)} \tag{14}
\]

where

\[
\mathrm{LRF}(i,j) = \left[\frac{\sum_{m=-M_2}^{M_2}\sum_{n=-N_2}^{N_2}\bigl(I(i+m,j+n)-I(i+m,j+n-1)\bigr)^2}{(2M_2+1)\times(2N_2+1)}\right]^{1/2} \tag{15}
\]

and

\[
\mathrm{LCF}(i,j) = \left[\frac{\sum_{m=-M_2}^{M_2}\sum_{n=-N_2}^{N_2}\bigl(I(i+m,j+n)-I(i+m-1,j+n)\bigr)^2}{(2M_2+1)\times(2N_2+1)}\right]^{1/2} \tag{16}
\]

and (2M2 + 1) × (2N2 + 1) is the local area size.

To further conform to the characteristics of the HVS, which is sensitive to local contrast changes, edges, and directional features, the LAF is employed as a feature of the highpass subband in place of the value of a single pixel in the contrast measurement. The directional feature contrast SR_{l,k}(i, j) at the lth scale, kth direction and location (i, j) is defined as

\[
\mathrm{SR}_{l,k}(i,j) = \frac{\mathrm{LAF}_{l,k}(i,j)}{\bar{I}_l(i,j)} \tag{17}
\]

where Ī_l(i, j) is the local area mean around the pixel (i, j) of the lowpass subband image I_l at the lth scale, calculated according to Eq. (18). In practice, to reduce the computational complexity, Ī_l(i, j) can be substituted with the coarsest lowpass subband image Ī_L(i, j).

\[
\bar{I}_l(i,j) = \frac{1}{(2M_2+1)(2N_2+1)}\sum_{m=-M_2}^{M_2}\sum_{n=-N_2}^{N_2} I_l(i+m,j+n) \tag{18}
\]

A pixel with a higher SR_{l,k}(i, j) value is more likely to correspond to important visual information, and the pixel at location (i, j) is then more likely to be in focus. Therefore, the proposed selection principle for the highpass subband coefficients is finally defined as

\[
I_{l,k}^F(i,j) = \begin{cases} I_{l,k}^A(i,j) & \text{if } Z'(i,j)=1 \text{ and } z(i,j)=m_1 n_1 \\ I_{l,k}^B(i,j) & \text{if } Z'(i,j)=0 \text{ and } z(i,j)=0 \\ I_{l,k}^A(i,j) & \text{if } 0<z(i,j)<m_1 n_1 \text{ and } \mathrm{SR}_{l,k}^A(i,j)\ge \mathrm{SR}_{l,k}^B(i,j) \\ I_{l,k}^B(i,j) & \text{if } 0<z(i,j)<m_1 n_1 \text{ and } \mathrm{SR}_{l,k}^A(i,j)< \mathrm{SR}_{l,k}^B(i,j) \end{cases} \tag{19}
\]

where I_{l,k}^A(i, j), I_{l,k}^B(i, j) and I_{l,k}^F(i, j) denote the highpass subband coefficients of source image A, source image B and the fused image F, respectively, at the lth scale, kth direction and location (i, j), and z(i, j) is defined as in Eq. (9) with the same meaning. A sketch of this rule follows below.

We should mention that the results of the proposed method are given for a 3 × 3 local area for estimating the local visibility and the directional feature contrast, since this area size provides good fusion performance in most cases.
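A matching sketch of Eqs. (14)–(19) for one pair of directional subbands; `low` is the (coarsest) lowpass subband of the same source used as Ī_L in Eq. (17), `size=3` gives the 3 × 3 local area, and the `eps` guard against a vanishing local mean is an addition of this sketch.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_spatial_frequency(sub, size=3):
    """Eqs. (14)-(16): local-area spatial frequency of a subband."""
    dr = np.zeros_like(sub); dr[:, 1:] = np.diff(sub, axis=1)  # I(i,j)-I(i,j-1)
    dc = np.zeros_like(sub); dc[1:, :] = np.diff(sub, axis=0)  # I(i,j)-I(i-1,j)
    lrf2 = uniform_filter(dr ** 2, size=size)                  # Eq. (15), squared
    lcf2 = uniform_filter(dc ** 2, size=size)                  # Eq. (16), squared
    return np.sqrt(lrf2 + lcf2)                                # Eq. (14)

def directional_contrast(sub, low, size=3, eps=1e-12):
    """Eq. (17): LAF normalized by the local mean (Eq. (18)) of the lowpass."""
    return local_spatial_frequency(sub, size) / (
        np.abs(uniform_filter(low, size=size)) + eps)

def fuse_highpass(sub_a, sub_b, low_a, low_b, z_mod, m1=3, n1=3):
    """Eq. (19): same region logic as Eq. (8), with SR as the border measure."""
    z = uniform_filter(z_mod.astype(float), size=(m1, n1)) * (m1 * n1)  # Eq. (9)
    border = (z > 0) & (z < m1 * n1)
    take_a = np.isclose(z, m1 * n1) | (
        border & (directional_contrast(sub_a, low_a)
                  >= directional_contrast(sub_b, low_b)))
    return np.where(take_a, sub_a, sub_b)
```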

4.3. Proposed fusion steps

The proposed multi-focus image fusion method is illustrated in Fig. 5, and the fusion process is accomplished by the following steps (a driver sketch follows the list):

Step 1: Decompose the source images A and B, respectively, into one lowpass subband and a series of highpass subbands at L levels and k directions via the NSCT.
Step 2: Detect the focused regions of the source images by the proposed method. Meanwhile, measure the LV and SR in a slipping window for the coefficients located at the focused border regions in the lowpass subband and each highpass subband, respectively.
Step 3: Select the fused NSCT coefficients for the lowpass subband and each highpass subband from A and B according to Eqs. (8) and (19), respectively.
Step 4: Reconstruct the image from the new fused subband coefficients by taking the inverse NSCT; the fused image is then obtained.

Fig. 5. Schematic diagram of the proposed image fusion method.
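Putting the steps together, as a sketch only: `nsct_decompose` and `nsct_reconstruct` are hypothetical placeholders (no standard Python NSCT implementation is assumed here), while the other helpers are the sketches given in Sections 3 and 4.

```python
def fuse_multifocus(img_a, img_b, levels=4, dirs=(4, 8, 8, 16)):
    fused0 = initial_fusion(img_a, img_b)                   # Section 3.1, Fig. 3
    z_mod = detect_focused_regions(img_a, img_b, fused0)    # Section 3.2
    low_a, highs_a = nsct_decompose(img_a, levels, dirs)    # Step 1 (placeholder)
    low_b, highs_b = nsct_decompose(img_b, levels, dirs)
    low_f = fuse_lowpass(low_a, low_b, z_mod)               # Eq. (8)
    highs_f = [[fuse_highpass(ha, hb, low_a, low_b, z_mod)  # Eq. (19)
                for ha, hb in zip(da, db)]
               for da, db in zip(highs_a, highs_b)]
    return nsct_reconstruct(low_f, highs_f)                 # Step 4 (placeholder)
```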
5. Experimental results and analysis

In order to show the advantages of the new method, we take two steps to demonstrate that the proposed method outperforms other fusion methods. First, the 'directional-feature-contrast-maximum-choosing' rule (abbreviated Fea-Con-max) is compared with other typical fusion rules, such as the 'Coef-abs-max' scheme and the 'traditional contrast maximum choosing' scheme [16] (abbreviated Tra-Con-max), which use the absolute value of a single pixel as the high frequency component in the contrast measurement, to demonstrate the performance of the proposed directional feature contrast measurement. Second, the proposed fusion algorithm is compared with other typical fusion methods, including the DWT-based method, the LSWT-based method, the NSCT-simple-based method, the NSCT-contrast-based method [16], and the block-based SML (BBS) method (with 8 × 8 blocks) [7]. In the first three methods, the lowpass subband coefficients and the highpass subband coefficients are simply merged by the 'averaging' scheme and the 'Coef-abs-max' scheme, respectively. The 'db5' wavelet and the 'db53' wavelet, each with a decomposition level of 4, are used in the DWT-based and LSWT-based methods. Four decomposition levels, with 4, 8, 8 and 16 directions from the coarser scale to the finer scale, are used in the NSCT-simple-based method, the NSCT-contrast-based method and our proposed fusion method. In addition, the fusion schemes for the lowpass and highpass subband coefficients from [16] are employed in the NSCT-contrast-based method.

5.1. Comparisons of fusion rules in the NSCT domain

In this section, we show why the 'Fea-Con-max' fusion rule improves the fusion performance. To simplify the discussion, the 'Fea-Con-max', 'Tra-Con-max' and 'Coef-abs-max' rules are compared on high-frequency subbands in the NSCT domain, with the labeled parts of Fig. 6(a) and (b) used as an example.

Fig. 6. Original multi-focus ‘pepsi’ images. (a) Focus on the left; (b) focus on the right; (c) Z matrix of step 2 in Section 3.2; (d) detected focused regions (Modified Z matrix
of step 3 in Section 3.2).

Fig. 7. Comparison of Coef-abs-max, Tra-Con-max, Fea-Con-max rules. (a) and (b) are the high frequency subbands of the labeled part in Fig. 6(a) and (b), respectively. (c)–(e)
are decision maps of Coef-abs-max, Tra-Con-max and Fea-Con-max rules, respectively.

Fig. 7(a) and (b) show the high frequency subbands of the labeled regions in Fig. 6(a) and (b), respectively, in the NSCT domain. One can see that the values of the coefficients in the clear part are greater than those in the blurry part. That is why the typical 'Coef-abs-max' rule is used in traditional MST-based fusion algorithms.

Fig. 7(c)–(e) show the decision maps of the 'Coef-abs-max', 'Tra-Con-max' and 'Fea-Con-max' rules, in which white indicates that the coefficients are selected from Fig. 7(b) and black that they are selected from Fig. 7(a). Since the labeled part of Fig. 6(b) is clearer than that of Fig. 6(a), the optimal decision map should be entirely white, meaning that all coefficients should be selected from Fig. 7(b). However, the decision maps of 'Coef-abs-max' and 'Tra-Con-max', shown in Fig. 7(c) and (d), indicate that these rules do not select the coefficients from the clear part completely, even though 'Tra-Con-max' performs better than 'Coef-abs-max'. Fig. 7(e) shows that the proposed directional feature contrast is the best measurement, which means that using the 'Fea-Con-max' selection principle in the NSCT domain produces the best fusion result for multi-focus image fusion. It is therefore reasonable to use the spatial frequency as a feature of the highpass subbands to replace the absolute value of a single pixel in the contrast measurement.

The results of the objective assessment are shown in Fig. 8, where 'From a' and 'From b' denote the numbers of pixels coming from Fig. 7(a) and (b), respectively. The proposed fusion rule is clearly superior to the others, because the number of pixels selected from Fig. 7(b) is the largest.

Fig. 8. Performance of the different fusion rules (numbers of coefficients selected from Fig. 7(a) and (b) under the Coef-abs-max, Tra-Con-max and Fea-Con-max rules).

5.2. Multi-focus image fusion

In this section, two sets of images with perfect registration and one set of images with mis-registration are used to evaluate the proposed fusion algorithm.

The first experiment is performed on one set of perfectly registered multi-focus source images. As shown in Fig. 4(a) and (b), each image contains multiple objects at different distances from the camera. The focus in Fig. 4(a) is on the clock, while that in Fig. 4(b) is on the book. The initial and modified versions of the detected focused regions are shown in Fig. 4(c) and (d), respectively. Bright pixels indicate that the corresponding coefficients from Fig. 4(a) are in the focused regions, whereas black pixels indicate that the corresponding coefficients from Fig. 4(b) are in the focused regions. Fig. 9(f) is the fusion result of the proposed method, and Fig. 9(a)–(e) are the fused images produced by the DWT-based method, the LSWT-based method, the NSCT-simple-based method, the NSCT-contrast-based method [16] and the BBS method [7], respectively. For better comparison, the difference images between the fused images shown in Fig. 9(a)–(f) and the source image in Fig. 4(b) are given in Fig. 9(g)–(l). For the focused regions, the difference between the source image and the fused image should be zero, so lower residue in the difference image means that the method conveys more useful information from the source images to the fused image than the other methods.

Focusing on the labeled regions in Fig. 9(g)–(i), one can clearly see that the fused images of the LSWT and NSCT methods are clearer than the DWT fused result. Moreover, the fused result of the DWT-based method, shown in Fig. 9(g), introduces many artifacts around edges, because the DWT lacks shift-invariance. This confirms that shift-invariant methods can successfully overcome the pseudo-Gibbs phenomena and improve the quality of the fused image. Fig. 9(j) and (k) indicate that the NSCT-contrast-based method and the BBS method provide better performance in fusing multi-focus images than the DWT-based, LSWT-based and NSCT-simple-based methods. However, Fig. 9(j) shows that the NSCT-contrast method does not extract all the useful information of the source images, nor transfer it to the fused image.

Fig. 9. The ‘Disk’ fused images of different methods. (a)–(f) are the fused images using DWT-based method, LSWT-based method, NSCT-simple-based method, NSCT-contrast-
based method, BBS-based method and the proposed method, respectively. (g)–(l) are the difference images between (a)–(f) and Fig. 4(b), respectively.

In addition, the labeled region in Fig. 9(k) shows that the fused image of the BBS method presents block effects, which degrade the appearance of the fused image. Fig. 9(l) indicates that almost all the useful information of the source images has been transferred to the fused image, while fewer artifacts were introduced during the fusion process. These comparisons demonstrate that our proposed fusion method not only inherits the merits of the Foc-Det method but also avoids producing erroneous results at the boundary of the focused regions.

In order to further evaluate the fusion performance, a second experiment is performed on another set of multi-focus images. As shown in Fig. 6(a) and (b), each image contains multiple objects at different distances from the camera. The focus in Fig. 6(a) is on the Pepsi can, while that in Fig. 6(b) is on the card. The initial and modified versions of the detected focused regions are shown in Fig. 6(c) and (d), respectively. Bright pixels indicate that the corresponding coefficients from Fig. 6(b) are in the focused regions, whereas black pixels indicate that the corresponding coefficients from Fig. 6(a) are in the focused regions. The fusion results obtained by the DWT-based, LSWT-based, NSCT-simple-based, NSCT-contrast-based, BBS and our proposed methods are shown in Fig. 10(a)–(f), respectively. For clearer comparison, the difference images between these fused images and the source image in Fig. 6(b) are given in Fig. 10(g)–(l). As illustrated in Fig. 10, and especially in Fig. 10(g)–(l), the proposed method extracts almost all of the well-focused parts of the source images and preserves the useful information better than the other methods, while fewer artifacts and erroneous results were introduced during the fusion process. Moreover, all the coefficients of the focused border regions were successfully selected to compose the coefficients of the fused image in the NSCT domain.
Fig. 10. The 'Pepsi' fused images of the different methods. (a)–(f) are the fused images using the DWT-based method, LSWT-based method, NSCT-simple-based method, NSCT-contrast-based method, BBS-based method and the proposed method, respectively. (g)–(l) are the difference images between (a)–(f) and Fig. 6(b), respectively.

One set of source images with mis-registration, shown in Fig. 11(a) and (b), is used to further evaluate the fusion performance in a third experiment. The focus in Fig. 11(a) is on the clock, while that in Fig. 11(b) is on the student.

Fig. 11. The multi-focus ‘Lab’ images and the detected focused regions. (a) Focus on the clock; (b) focus on the student; (c) Z matrix of step 2 in Section 3.2; (d) detected
focused regions (modified Z matrix of step 3 in Section 3.2).

Fig. 12. The ‘Lab’ fused images of different methods. (a)–(f) are the fused images using DWT-based method, LSWT-based method, NSCT-simple-based method, NSCT-contrast-
based method, BBS-based method and the proposed method, respectively. (g)–(l) are the difference images between Fig. 12(a)–(f) and Fig. 11(b), respectively.
As can be seen from Fig. 11(a) and (b), there is a slight movement of the student's head between the two images. The initial and modified versions of the detected focused regions are shown in Fig. 11(c) and (d), respectively. The fused images of the different methods are shown in Fig. 12(a)–(f). Again, a clearer comparison can be made by examining the differences between the fused images and the source image shown in Fig. 11(b). From Fig. 12, and especially Fig. 12(g)–(l), we can conclude that the proposed algorithm achieves higher performance and is more robust to mis-registration.

For further comparison, beyond visual observation, two objective criteria are used to compare the fusion results. The first criterion is the mutual information (MI) [37], a metric defined as the sum of the mutual information between each input image and the fused image (a code sketch is given at the end of this section). The second criterion is the QAB/F metric [38], proposed by Xydeas and Petrovic, which considers the amount of edge information transferred from the input images to the fused image; it uses a Sobel edge detector to calculate the strength and orientation information at each pixel in both the source and the fused images. For both criteria, the larger the value, the better the fusion result.

Table 1
Evaluation of fusion results by the mutual information (MI) criterion.

Methods               Disk     Pepsi    Lab
DWT-based             5.4628   6.4711   6.6298
LSWT-based            5.8967   6.7618   7.0317
NSCT-simple-based     5.9080   6.7756   7.0614
NSCT-contrast-based   6.3680   7.1327   7.1990
BBS-based             8.1012   8.4728   8.4909
Proposed method       8.0970   8.7862   8.6365

Table 2
Evaluation of fusion results by the QAB/F criterion.

Methods               Disk     Pepsi    Lab
DWT-based             0.6258   0.7282   0.6769
LSWT-based            0.6685   0.7544   0.7103
NSCT-simple-based     0.6736   0.7582   0.7133
NSCT-contrast-based   0.6838   0.7654   0.7179
BBS-based             0.7270   0.7868   0.7518
Proposed method       0.7310   0.7916   0.7555

The values of MI and QAB/F for Figs. 9(a)–(f), 10(a)–(f) and 12(a)–(f) are listed in Tables 1 and 2, respectively. From Tables 1 and 2, we observe that the different fusion methods provide different fusion performance. Although, for the 'Disk' images, the mutual information of the BBS method is slightly larger than that of our proposed method, the visual appearance of the BBS fused image is clearly inferior, because it presents block effects that seriously affect the appearance of the fused image. Considering both the visual effect and the objective evaluation results, we can conclude that the proposed method achieves the highest performance, even when the source images are not well registered.

Furthermore, the MI and QAB/F values in Tables 1 and 2 indicate that the NSCT-simple-based method transfers more information to the fused images than the DWT-based and LSWT-based methods. From this comparison, it can be concluded that the NSCT is the best of these MST methods, which is why the NSCT is used as the MST in our proposed method. It can also be observed that the fusion performance of the LSWT is similar to that of the NSCT, while the computational complexity of the LSWT is, theoretically, much lower; that is why the LSWT, rather than the NSCT, is used to generate the initial fused image.
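The MI criterion can be sketched from the joint gray-level histogram, assuming 8-bit grayscale inputs; the reported fusion score is MI(A, F) + MI(B, F).

```python
import numpy as np

def mutual_information(x, y, bins=256):
    """MI from the joint gray-level histogram of two images."""
    pxy, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    pxy /= pxy.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / np.outer(px, py)[nz])))

def fusion_mi(img_a, img_b, fused):
    return mutual_information(img_a, fused) + mutual_information(img_b, fused)
```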
6. Conclusions

In this paper, a new method for multi-focus image fusion, which combines a transform domain method with a spatial domain method, is presented. Its underlying advantages are as follows: (1) all of the coefficients within the focused regions can be successfully selected to compose the coefficients of the fused image in the NSCT domain; (2) the drawbacks of the focused-regions-based methods, which may introduce artifacts or erroneous results at the boundaries of the focused regions during the fusion process, are successfully overcome; (3) using the detected focused regions to guide the fusion process increases the reliability of the fusion method; and (4) the capability of avoiding imperfections such as mis-registration is ensured. The experimental results on several pairs of multi-focus images demonstrate the superior performance of the proposed fusion scheme.

Acknowledgements

This work is jointly supported by the National Natural Science Foundation of China (No. 60974090), the Ph.D. Programs Foundation of the Ministry of Education of China (No. 200806110016), and the Fundamental Research Funds for the Central Universities (No. CDJXS10172205).

References

[1] S. Gabarda, G. Cristóbal, On the use of a joint spatial-frequency representation for the fusion of multi-focus images, Pattern Recognition Letters 26 (16) (2005) 2572–2578.
[2] O. Rockinger, Pixel-level fusion of image sequences using wavelet frames, in: Proceedings of the 16th Leeds Applied Shape Research Workshop, Leeds University Press, 1996.
[3] P.W. Huang, C.I. Chen, P.L. Lin, Multi-focus image fusion based on salient edge information with adaptive focus-measuring windows, in: IEEE International Conference on Systems, Man, and Cybernetics, 2009, pp. 2589–2594.
[4] S.T. Li, B. Yang, Multifocus image fusion using region segmentation and spatial frequency, Image and Vision Computing 26 (7) (2008) 971–979.
[5] S.T. Li, J.T. Kwok, Y.N. Wang, Multifocus image fusion using artificial neural networks, Pattern Recognition Letters 23 (8) (2002) 985–997.
[6] I.H. De, B.B. Chanda, B.H. Chattopadhyay, Enhancing effective depth-of-field by image fusion using mathematical morphology, Image and Vision Computing 24 (12) (2006) 1278–1287.
[7] H. Wei, Z.L. Jing, Evaluation of focus measures in multi-focus image fusion, Pattern Recognition Letters 28 (4) (2007) 493–500.
[8] Y.J. Zhang, L.L. Ge, Efficient fusion scheme for multi-focus images by using blurring measure, Digital Signal Processing 19 (2) (2009) 186–193.
[9] V. Aslantas, R. Kurban, Fusion of multi-focus images using differential evolution algorithm, Expert Systems with Applications 37 (12) (2010) 8861–8870.
[10] V.S. Petrovic, C.S. Xydeas, Gradient-based multiresolution image fusion, IEEE Transactions on Image Processing 13 (2) (2004) 228–237.
[11] S.T. Li, B. Yang, Hybrid multiresolution method for multisensor multimodal image fusion, IEEE Sensors Journal 10 (9) (2010) 1519–1525.
[12] S.Y. Yang, M. Wang, Y.X. Lu, W. Qi, L.C. Jiao, Fusion of multiparametric SAR images based on SW-nonsubsampled contourlet and PCNN, Signal Processing 89 (12) (2009) 2596–2608.
[13] Y. Chai, H.F. Li, M.Y. Guo, Multifocus image fusion scheme based on features of multiscale products and PCNN in lifting stationary wavelet domain, Optics Communications 284 (5) (2011) 1146–1158.
[14] G. Pajares, J. Cruz, A wavelet-based image fusion tutorial, Pattern Recognition 37 (9) (2004) 1855–1872.
[15] X.B. Qu, J.W. Yan, H.Z. Xiao, et al., Image fusion algorithm based on spatial frequency-motivated pulse coupled neural networks in nonsubsampled contourlet transform domain, Acta Automatica Sinica 34 (12) (2008) 1508–1514.
[16] Q. Zhang, B.L. Guo, Multifocus image fusion using the nonsubsampled contourlet transform, Signal Processing 89 (2009) 1334–1346.
[17] S.T. Li, B. Yang, Multifocus image fusion by combining curvelet and wavelet transform, Pattern Recognition Letters 29 (9) (2008) 1295–1301.
[18] D.L. Donoho, A.G. Flesia, Can recent innovations in harmonic analysis 'explain' key findings in natural image statistics?, Network: Computation in Neural Systems 12 (3) (2001) 371–393.
[19] P.J. Burt, E.H. Adelson, The Laplacian pyramid as a compact image code, IEEE Transactions on Communications 31 (4) (1983) 532–540.
[20] M.N. Do, M. Vetterli, The contourlet transform: an efficient directional multiresolution image representation, IEEE Transactions on Image Processing 14 (12) (2005) 2091–2106.
[21] A.L. da Cunha, J.P. Zhou, M.N. Do, The nonsubsampled contourlet transform: theory, design, and applications, IEEE Transactions on Image Processing 15 (12) (2006) 3089–3101.
[22] E.J. Candes, Ridgelets: Theory and Applications, Ph.D. Dissertation, Department of Statistics, Stanford University, 1998.
[23] L. Yang, B.L. Guo, W. Li, Multimodality medical image fusion based on multiscale geometric analysis of contourlet transform, Neurocomputing 72 (1) (2008) 203–211.
[24] M.J. Shensa, The discrete wavelet transform: wedding the à trous and Mallat algorithms, IEEE Transactions on Signal Processing 40 (10) (1992) 2464–2482.
[25] R.H. Bamberger, M.J.T. Smith, A filter bank for the directional decomposition of images: theory and design, IEEE Transactions on Signal Processing 40 (4) (1992) 882–893.
[26] W. Sweldens, The lifting scheme: a custom-design construction of biorthogonal wavelets, Applied and Computational Harmonic Analysis 3 (2) (1996) 186–200.
[27] W. Sweldens, The lifting scheme: a construction of second generation wavelets, SIAM Journal on Mathematical Analysis 29 (2) (1998) 511–546.
[28] R.L. Claypoole, G.M. Davis, W. Sweldens, R. Baraniuk, Nonlinear wavelet transforms for image coding via lifting, IEEE Transactions on Image Processing 12 (12) (2003) 1449–1459.
[29] Y. Chai, H.F. Li, Z.F. Li, Multifocus image fusion scheme using focused region detection and multiresolution, Optics Communications 284 (19) (2011) 4376–4389.
[30] T. Stathaki, Image Fusion: Algorithms and Applications, Academic Press, 2008.
[31] M. Li, W. Cai, Z. Tan, A region-based multi-sensor image fusion scheme using pulse-coupled neural network, Pattern Recognition Letters 27 (16) (2006) 1948–1956.
[32] J.W. Huang, Y.Q. Shi, X.H. Dai, A segmentation-based image coding algorithm using the features of human vision system, Journal of Image and Graphics 4 (5) (1999) 400–404.
[33] A. Toet, L.J. van Ruyven, J.M. Valeton, Merging thermal and visual images by a contrast pyramid, Optical Engineering 28 (7) (1989) 789–792.
[34] Q. Zhang, B.L. Guo, Fusion of multi-sensor images based on the nonsubsampled contourlet transform, Acta Automatica Sinica 34 (2) (2008) 135–141.
[35] A.M. Eskicioglu, P.S. Fisher, Image quality measures and their performance, IEEE Transactions on Communications 43 (12) (1995) 2959–2965.
[36] V. Aslantas, R. Kurban, A comparison of criterion functions for fusion of multi-focus noisy images, Optics Communications 282 (16) (2009) 3231–3242.
[37] G. Qu, D. Zhang, P. Yan, Information measure for performance of image fusion, Electronics Letters 38 (7) (2002) 313–315.
[38] V. Petrovic, C. Xydeas, On the effects of sensor noise in pixel-level image fusion performance, in: Proceedings of the Third International Conference on Image Fusion, vol. 2, IEEE, Paris, France, 2000, pp. 14–19.
