
Low-Light Image Enhancement with Semi-Decoupled Decomposition
Shijie Hao, Xu Han, Yanrong Guo, Xin Xu, and Meng Wang, Senior Member, IEEE

Abstract—Low-light image enhancement is important for high-quality image display and other visual applications. However, it is a challenging task, as the enhancement is expected to improve the visibility of an image while keeping its visual naturalness. Retinex-based methods have been well recognized as a representative technique for this task, but they still have the following limitations. First, due to less-effective image decomposition or strong imaging noise, various artifacts can still be brought into the enhanced results. Second, although prior information can be exploited to partially solve the first issue, it requires carefully modeling the prior with a regularization term, which usually makes the optimization process complicated. In this paper, we address these issues by proposing a novel Retinex-based low-light image enhancement method, in which the Retinex image decomposition is achieved in an efficient semi-decoupled way. Specifically, the illumination layer I is gradually estimated only with the input image S, based on the proposed Gaussian Total Variation model, while the reflectance layer R is jointly estimated by S and the intermediate I. In addition, the imaging noise can be simultaneously suppressed during the estimation of R. Experimental results on several public datasets demonstrate that our method produces images with both higher visibility and better visual quality, outperforming state-of-the-art low-light enhancement methods in terms of several objective and subjective evaluation metrics.

Index Terms—Low-light images, image enhancement, Retinex model.

Manuscript received September 22, 2019; revised December 12, 2019; accepted January 7, 2020. The research was supported in part by the National Key Research and Development Program under Grant No. 2018YFB0804203, and in part by the National Nature Science Foundation of China under Grant No. 61772171, 61702156, and 61632007.

S. Hao, X. Han, Y. Guo, and M. Wang are with the Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Ministry of Education, and the School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China (e-mail: hfut.hsj@gmail.com; xuhan@mail.hfut.edu.cn; yrguo@hfut.edu.cn; eric.mengwang@gmail.com).

X. Xu is with the School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China (e-mail: xuxin@wust.edu.cn).

I. INTRODUCTION

Images taken under imperfect light conditions usually have low contrast, unclear details and strong noise. These visual effects are undesirable, and make images less appealing. Therefore, it is valuable to enhance images in terms of correcting contrast, recovering details and suppressing noise. Generally, this task is called low-light enhancement, contrast enhancement, or exposure correction. The fundamental goal of this task is to enhance the visual quality of an image.

There are two main challenges in obtaining a satisfying enhanced image. The first one is the complex illumination condition, including the low light condition, the back light condition, and the mixed condition. A low light source is not able to illuminate the whole scene, and thus leads to a globally dark appearance of the photograph. Differently, the back light usually causes the light source and its nearby region to be over-exposed, while the rest of the image regions are under-exposed. Moreover, mixed situations also exist, such as an image taken against streetlights on a dark street. Therefore, it is important to accurately estimate the illumination distribution under these complex conditions. The second obstacle is the noise hidden in the dark image regions. In these regions, the intensity of imaging noise is usually on par with the intensity of fine edges and textures. Therefore, over-amplified noise can possibly be introduced as a byproduct of the enhancing process, which greatly degrades the visual quality. In this context, in addition to lightening dark regions, noise suppression should also be considered in the enhancement model.

Currently, the Retinex image representation provides a solid and flexible framework for the low-light enhancement task. In the Retinex theory, an original image S can be represented as the product of an illumination layer I and a reflectance layer R. Generally, the Retinex-based methods boil down to effectively solving the ill-posed I and R decomposition [1]–[3]. On the one hand, the layer I represents the distribution of the scene illumination, and spatially determines the darkened regions in the scene. On the other hand, the layer R represents the material properties of the scene surface, and is assumed to be invariant. With I and R, the low-light enhancement can be simply realized as I^γ ◦ R, where I^γ is the Gamma Correction that non-linearly remaps the illumination distribution (γ is empirically set as 1/2.2 in most cases). To further improve the visual quality of enhanced images, it is feasible to design some new constraints, such as priors of shape [4], texture [4], [5], exposure [5], or even noise [6], and encode them into the decomposition model. However, limitations still exist in these Retinex-based methods. For example, artifacts such as over-amplified noise and edge reversal can still be introduced during the enhancement. Moreover, by introducing more constraints, the decomposition process tends to become more coupled, with the constraints mutually affecting each other. This often leads to more iterations before convergence in the optimization process.

In this paper, we make a simple two-fold assumption: an ideal illumination layer I should be sufficiently piecewise-smooth, while an ideal reflectance layer R should be noise-free and contain as many fine-scale details as possible. These two folds are highly related to each other. The piecewise-smooth I layer only contains large-scale structures and flat regions, which leaves all the small-scale contents and possible imaging noise to the obtained R layer.


According to the above analysis, we introduce a modified Retinex model. Based on that, we propose a novel two-stage model to estimate I and R. 1) We first estimate I via a proposed edge-preserving image filter based on Gaussian Total Variation (GTV). 2) Then, we estimate R under the Retinex constraint, as well as a regularization term on R for dealing with the imaging noise N. By alternately implementing these two stages, I and R are gradually refined until convergence, and the possible imaging noise N can be simultaneously suppressed during the refinement of R.

Our decomposition can be considered as a semi-decoupled model, as the first stage does not involve the reflectance, while the second stage involves both the illumination and the reflectance. In this perspective, our decomposition model lies between the coupled models [1]–[4], [6] and the decoupled models [7], [8] that only estimate I. Compared with these works, the semi-decoupled model not only ensures the quality of the enhancement results, but also facilitates a relatively fast convergence rate of the optimization. The feasibility of the semi-decoupled model is as follows. On the one hand, a well-performed edge-preserving filter is qualified for satisfying the piecewise-smoothness assumption on I. On the other hand, if we brought R directly into the refinement of I, it would potentially introduce a negative impact, since the assumption on the reflectance is totally different from that on the illumination.

The contribution of our method includes three aspects. First, we propose an edge-preserving image filter based on Gaussian Total Variation, which achieves good performance on refining I in terms of piecewise smoothing and texture removal. Second, based on a modified Retinex model, we propose a semi-decoupled decomposition model for estimating I and R, which achieves both high decomposition quality and fast convergence. Third, by equipping a simple but effective denoising term on R, our model can be easily extended into an imaging-noise-suppression version. By validating on several public datasets for low-light enhancement, our method obtains good performances in both qualitative and quantitative evaluations.

The rest of the paper is organized as follows. Section II briefly introduces the related works on low-light enhancement. In Section III and Section IV, we present the Gaussian Total Variation and the semi-decoupled decomposition model, respectively. Experimental results and analysis are shown in Section V. In Section VI, we conclude our research and discuss several possible future directions.

II. RELATED WORK

In this section, we briefly review the low-light image enhancement methods, which can be generally divided into histogram-based, Retinex-based, fusion-based and learning-based models.

Histogram-based methods [9] aim at optimizing the shape of an image histogram, which only counts pixel intensities but ignores their spatial information. Therefore, this model tends to produce over- or under-enhancement effects, but has its advantage in the fast implementation speed. Moreover, the enhanced results can be used as a complementary input for the fusion-based methods [10].

Retinex theory assumes that an image can be represented by an illumination layer multiplied by a reflectance layer. Full Retinex-based methods decompose the input image into these two layers, and then combine them again after a Gamma correction on the illumination layer. Fu et al. [2] proposed a weighted variational model to decompose the illumination layer and the reflectance layer simultaneously. By building the local variation deviation, Cai et al. [4] introduced the illumination, shape and texture priors into the Retinex model. In [11], an illumination estimation algorithm based on a joint edge-preserving filter is proposed for the naturalness-preserving Retinex decomposition. Yue et al. [3] encoded a color constraint on the reflectance layer as well as a smoothness constraint on the illumination layer, where the illumination and the reflectance layer are obtained via alternating optimization. There also exists a branch of simplified Retinex methods, which directly consider the reflectance as the enhanced image [7], [8]. However, the simplified Retinex methods are prone to causing over-enhanced or unrealistic effects. To relieve this issue, Zhang et al. [5] encoded the color and texture priors into the process of illumination layer estimation. The simplified Retinex-based methods [5], [7], [8] can be regarded as a fully decoupled process, as they only focus on the estimation of the illumination layer I. A common limitation of the Retinex-based methods lies in that they do not explicitly tackle the imaging noise problem. Therefore, two post-processing steps are used in [8]: the enhanced image is denoised by the BM3D model [12], and then linearly fused with the original image. Differently, Li et al. [6] extended the traditional Retinex model by adding a noise term N, i.e. S = I ◦ R + N. Based on this model, Ren et al. [13] added a spatial smoothing constraint and sequentially decomposed the input image. Our method also considers the imaging noise, but differs in that we assume the imaging noise is mixed with the decomposed reflectance layer, i.e. S = I ◦ (R + N), which facilitates a more effective semi-decoupled decomposition process.

Multiple source fusion is another roadmap for the image enhancement task, which takes advantage of the complementary information of the same scene provided by multi-exposure images [14], [15]. Ma et al. [16] designed a multiple exposure fusion model, which fuses images at the patch level. Kinoshita and Kiya [17] fused multi-exposure images under the guidance of scene brightness. In many real-world applications, this roadmap is possibly limited due to the lack of fusion sources. To solve this issue, a useful strategy is to generate several intermediate enhanced images as fusion sources. How to mimic such fusion sources is an open problem, and its key issue lies in the seamless fusion process. Typically, Fu et al. [10] adopted histogram equalization and nonlinear curve reshaping, and then fused the enhanced images based on a Laplacian image pyramid. Hao et al. [18] used a simplified Retinex model to generate a fusion source, and then fused it with the original image, with a light- and structure-aware weight map designed for the fusion.

Recently, deep neural networks have emerged as a useful tool in the task of exposure correction.


Lore et al. [19] proposed the seminal work based on the auto-encoder model. However, it is limited in preserving fine image details. In recent years, the Convolutional Neural Network (CNN) has been broadly applied [20]–[22], where skip connections are used to preserve image details. Generally, the success of the deep-learning-based methods is attributed to the following aspects. The first one is the network's ability to learn a content- and lightness-aware mapping function from the pairwise normal-and-dark training data. The second one is the subtle design of the enhancement model. For example, Ren et al. [23] designed a hybrid model, in which a Recurrent Neural Network (RNN) is used to model high-frequency edges, while an auto-encoder is used to model low-frequency image contents. In most current methods, the performance still highly depends on pairwise training datasets, which are not easy to collect at a large scale. To address this issue, Jiang et al. [24] proposed a highly effective unsupervised generative adversarial network. We also note that many other kinds of information can be used to build a learning-based low-light enhancement model, such as social information [25] and user interaction [26].

III. EDGE-PRESERVING FILTER BASED ON GAUSSIAN TOTAL VARIATION

We assume that an illumination layer I is sufficiently piecewise-smooth for Retinex decomposition. Based on this assumption, I can be directly obtained by imposing an edge-preserving filter on the input image S. In this section, we propose a novel edge-preserving filter regularized by the Gaussian Total Variation (GTV) for the task of Retinex decomposition.

A. Gaussian Total Variation

First, we extend the traditional total variation (TV) [27] |∇x,y T| into the Gaussian Total Variation (GTV):

  ||∇x,y T / exp((∇x,y Gσ2(T))² / (2σ1²))||_1,   (1)

where T is the filtered image, and ∇x,y T is the differential of T in the x or y direction. We add a Gaussian kernel term as the denominator, with σ1 as its kernel width. In the kernel, prior to taking the differential in the x or y direction, we also apply a Gaussian filter Gσ2(·) on T, with σ2 as the spatial width.

By applying the GTV as the regularization term, we build our filtering model with the following target function:

  argmin_T ||T − S||²_2 + λ( ||∇x T / exp((∇x Gσ2(T))² / (2σ1²))||_1 + ||∇y T / exp((∇y Gσ2(T))² / (2σ1²))||_1 ),   (2)

where S is the original image, and λ is the balancing parameter. The first term is an L2-norm fidelity term that guarantees the overall similarity between the filtered image and the original image.

The L1-norm in the GTV regularization term brings difficulty in solving Eq. (2). To address this problem, we make the following approximation of the L1-norm:

  ||∇x,y T / exp((∇x,y Gσ2(T))² / (2σ1²))||_1
    = || (∇x,y T)² / ((∇x,y T) exp((∇x,y Gσ2(T))² / (2σ1²))) ||_1
    ≈ [1 / (max(|∇x,y T|, ϵ) exp((∇x,y Gσ2(T))² / (2σ1²)))] ||∇x,y T||²_2,   (3)

where ϵ is a small positive constant (empirically set as ϵ = 0.001) to avoid a zero denominator. Eq. (3) decomposes the GTV regularization approximation into a quadratic term ||∇x,y T||²_2 and a non-linear weight defined as ω_x,y = 1 / (max(|∇x,y T|, ϵ) exp((∇x,y Gσ2(T))² / (2σ1²))). Then we can rewrite the global optimization function as:

  argmin_T ||T − S||²_2 + λ(ω_x ||∇x T||²_2 + ω_y ||∇y T||²_2),   (4)

which can be reformulated into the matrix form:

  argmin_T (T − S)^T (T − S) + λ(T^T Dx^T Wx Dx T + T^T Dy^T Wy Dy T).   (5)

Here T and S are the matrix representations of T and S, respectively. Dx and Dy are the Toeplitz matrices from the discrete gradient operators with forward difference. Wx and Wy are diagonal matrices containing the weights ω_x and ω_y. By setting the derivative of Eq. (5) with respect to T equal to 0, we have

  (T − S) + λ(Dx^T Wx Dx T + Dy^T Wy Dy T) = 0.   (6)

T can be analytically solved as:

  T = (1 + λL)^(−1) S,   (7)

where 1 is the identity matrix with the same size as S, and L = Dx^T Wx Dx + Dy^T Wy Dy is a sparse five-point definite Laplacian matrix [30]. Based on the analytical solution in Eq. (7), we can design an iterative filter to produce a piecewise smoothing result T_k:

  T_k = (1 + λL_(k−1))^(−1) S.   (8)

For solving Eq. (8) efficiently, we compute the inverse with the help of the preconditioned conjugate gradient (PCG) [35] technique, where the complexity is reduced to O(N).

In Fig. 1, we compare our method with several state-of-the-art edge-preserving filters, including TV [27], GF [28], BLF [29], WLS [30], RTV [31], ROG [32], muGIF [33], SDTS [34] and LVD [4]. We initialize T_0 = S in Eq. (8), and set the parameters k, λ, σ1² and σ2 as k = 3, λ = 0.1 ∗ 100, σ1² = 0.0005, σ2 = 5. The different edge-preserving filters are compared on a noisy image and a line of pixels extracted from it. It can be observed that, after equipping the GTV regularization, our filter performs better in removing texture details while keeping the basic structure undistorted.
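To make the above derivation concrete, the filter of Eqs. (3)–(8) can be sketched in a few lines of Python. This is an illustrative sketch rather than the released implementation: central differences stand in for the forward-difference operators, boundary handling is left simple, and SciPy's plain conjugate gradient replaces the PCG technique.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg
from scipy.ndimage import gaussian_filter

def gtv_filter(S, lam=10.0, sigma1_sq=0.0005, sigma2=5.0, k=3, eps=1e-3):
    """Iterative GTV edge-preserving filter, a sketch of Eqs. (3)-(8)."""
    h, w = S.shape
    n = h * w
    s = S.ravel()
    # forward-difference operators D_x and D_y on the flattened image
    Dx = sp.diags([-1.0, 1.0], [0, 1], shape=(n, n), format="csr")
    Dy = sp.diags([-1.0, 1.0], [0, w], shape=(n, n), format="csr")
    T = S.copy()
    for _ in range(k):
        G = gaussian_filter(T, sigma2)          # G_{sigma2}(T)
        gy, gx = np.gradient(G)                 # central differences as a stand-in
        ty, tx = np.gradient(T)
        # non-linear weights omega_{x,y} of Eq. (3)
        wx = 1.0 / (np.maximum(np.abs(tx), eps) * np.exp(gx**2 / (2.0 * sigma1_sq)))
        wy = 1.0 / (np.maximum(np.abs(ty), eps) * np.exp(gy**2 / (2.0 * sigma1_sq)))
        # sparse Laplacian L = Dx^T Wx Dx + Dy^T Wy Dy of Eq. (7)
        L = (Dx.T @ sp.diags(wx.ravel()) @ Dx +
             Dy.T @ sp.diags(wy.ravel()) @ Dy)
        A = sp.identity(n) + lam * L
        t, _ = cg(A, s, x0=T.ravel())           # iterative solve of Eq. (8)
        T = t.reshape(h, w)
    return T
```

With k = 3, lam = 10.0 (i.e. λ = 0.1 ∗ 100), sigma1_sq = 0.0005 and sigma2 = 5, the call gtv_filter(S) mirrors the setting used for Fig. 1(k), up to the simplifications noted above.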


Fig. 1. Comparison of different edge/structure-preserving image smoothing methods on a noisy image. (a) Input. (b) TV [27] (k = 100). (c) GF [28] (r = 8, ϵ = 0.004). (d) BLF [29] (σs = 12, σr = 0.2). (e) WLS [30] (λ = 1, α = 1.2). (f) RTV [31] (λ = 0.008, σ = 3, t = 4). (g) ROG [32] (k = 3, λ = 0.01, σ1 = 1, σ2 = 3). (h) muGIF [33] (αt = 0.008, k = 10). (i) SDTS [34] (α = 1.5, th = 0.001, k = 10). (j) LVD [4] (α = 0.001, r = 3, k = 3). (k) GTV (k = 3, λ = 0.1 ∗ 100, σ1² = 0.0005, σ2 = 5).

Fig. 2. Effect of our variation measures. (a) Input. (b) |∇x T| + |∇y T|. (c) ||∇x T / exp((∇x Gσ2(T))² / (2σ1²))||_1 + ||∇y T / exp((∇y Gσ2(T))² / (2σ1²))||_1. (d) 1/exp((∇x Gσ2(T))² / (2σ1²)) + 1/exp((∇y Gσ2(T))² / (2σ1²)). (e) Result of (b), k = 3, λ = 0.1 ∗ 100. (f) Result of (c), k = 3, λ = 0.1 ∗ 100, σ1² = 0.01, σ2 = 5.

Fig. 3. Effect of Gσ2(·). This figure demonstrates the decomposition of the proposed Retinex model with Eq. (21) and (23). (a) Input. (b) Illumination map (w/o Gσ2(·)). (c) Reflectance map (w/o Gσ2(·)). (d) Enhanced image (w/o Gσ2(·)), σ1² = 0.001. (e) Zoomed-in patches of (b-d) and (f-h). (f) Illumination map. (g) Reflectance map. (h) Enhanced image, σ1² = 0.001 and σ2 = 10.

Fig. 4. Results of using TV and GTV as the regularization term in our model. (a) Input. (b) Illumination map. (c) Reflectance map. (d) Enhanced image. (b-d) are based on TV regularization. (e) Zoomed-in patches of (b-d) and (f-h). (f) Illumination map. (g) Reflectance map. (h) Enhanced image. (f-h) are based on our GTV regularization, σ1² = 0.01, σ2 = 5.

B. Analysis of GTV

We further analyze the proposed GTV by decomposing the GTV regularization into two parts, i.e. |∇x T| + |∇y T| and 1/exp((∇x Gσ2(T))² / (2σ1²)) + 1/exp((∇y Gσ2(T))² / (2σ1²)). Taking an image with rich textures (Fig. 2(a)) as an example, these two terms can be visualized in Fig. 2(b) and Fig. 2(d).


Although |∇x T| + |∇y T| contains entire gradients, including fine and salient structures, this term cannot well distinguish finer structures from salient ones. For instance, in Fig. 2(b), the gradients of the head decoration in the sculpture (finer structures) are as strong as the object boundary (salient structures). By only encoding TV in the filter, the filtering result has a less effective edge-preserving performance, e.g. Fig. 2(e). To solve this issue, we also consider the second term 1/exp((∇x Gσ2(T))² / (2σ1²)) + 1/exp((∇y Gσ2(T))² / (2σ1²)). As shown in Fig. 2(d), this term highlights the positions of salient structures against other structures in the image. Moreover, we obtain the GTV by combining these two terms together. From Fig. 2(c), we can observe that the GTV-based map concentrates mainly on the fine structures. Therefore, the GTV-based regularization performs edge-preserving filtering much better. For example, Fig. 2(f), based on our GTV regularization, shows a better piecewise-smooth representation than Fig. 2(e), based on the traditional TV.

Of note, we impose a Gaussian filter Gσ2(·) on T in our GTV, which adjusts the edge sharpness of a filtered image. This is important for the Retinex-based low-light enhancement task, since it can effectively prevent the enhanced results from generating the edge halo artifact. The reason for generating this artifact can be explained as follows. Under the assumption of S = I ◦ R, an almost ideally sharp edge boundary in the illumination map (e.g. Fig. 3(b)) can introduce numerical instability into the reflectance map (e.g. Fig. 3(c)), and thus generates artifacts in the enhanced image (e.g. Fig. 3(d)). As for GTV, the term Gσ2(·) harmonizes the sharp edge at a very small scale (e.g. Fig. 3(f)), and then relieves the instability issue in the reflectance map (e.g. Fig. 3(g)), which in turn guarantees the visual quality of the enhanced image (e.g. Fig. 3(h)).

In addition, we also compare the results obtained by using GTV and TV as regularization terms in our low-light enhancement model. In Fig. 4, compared with the TV-based regularization, the GTV-based regularization maintains the structure of the illumination map and performs edge-preserving filtering much better. Based on this, in the Retinex model decomposition, the GTV-based regularization produces a clearer reflectance map structure than the TV-based regularization does. Therefore, for the final enhanced images, the results based on GTV regularization have better visual effects.

IV. SEMI-DECOUPLED RETINEX DECOMPOSITION FOR LOW-LIGHT IMAGE ENHANCEMENT

A. Semi-decoupled Retinex Decomposition

Considering the imaging noise during the enhancement, the traditional Retinex model can be modified as:

  S = I ◦ (R + N),   (9)

where R is the noise-free reflectance layer, N is the imaging noise, and R + N is the observed reflectance layer. In general, the multiplication relationship of the traditional Retinex model is well kept in Eq. (9). In addition, we present our motivation for using this modified Retinex model by comparing it with S = I ◦ R + N proposed in [6]. On the one hand, since I is assumed to be sufficiently piecewise-smooth, it should be free of the small-scale imaging noise, and the imaging noise is supposed to exist in the other decomposed layer, i.e., R + N. Therefore, we only need to impose the denoising task on the decomposed R + N. On the other hand, the process of denoising in [6] involves I, R, and N simultaneously, which tends to increase the difficulty of the optimization process.

Based on our modeling, the enhancement is implemented by the following steps. First, for an input image S, we obtain the estimated illumination layer I and the estimated reflectance layer R. Then, we apply the Gamma correction on I, and recompose the two layers as S′ = I^γ ◦ R to obtain the enhanced image. Here, we assume that S is a 2D gray image. To process color images, the above framework can be repeated in a channel-wise style, by regarding S as each channel of an RGB image. The key step of the above process is the image decomposition described in the following.

First, the illumination layer I is estimated based on the GTV filter with S at hand. Then, based on S and the filtered I, we estimate the reflectance layer R, and conduct denoising simultaneously.

Specifically, for estimating I, we build the following target function based on the GTV filter:

  argmin_I ||I − Î||²_2 + α( ||∇x I / exp((∇x Gσ2(I))² / (2σ1²))||_1 + ||∇y I / exp((∇y Gσ2(I))² / (2σ1²))||_1 ),   (10)

where ||I − Î||²_2 represents the fidelity between the initial illumination Î and the refined result I, and α is the balancing parameter for the GTV regularization term. To generate the initial illumination map Î, we use the maxRGB representation of the image as in [8], [18]. For each pixel p, we have Î(p) = max_{c∈{R,G,B}} S_c(p), which reflects the illumination of the imaging scene to some extent.

Based on the estimated I and S, we further build the target function for estimating R:

  argmin_R ||I ◦ R − S||²_2 + β( ||∇x R / exp((∇x Gσ2(R))² / (2σ1²))||_1 + ||∇y R / exp((∇y Gσ2(R))² / (2σ1²))||_1 ),   (11)

where ||I ◦ R − S||²_2 is the fidelity term that measures the similarity between S and I ◦ R, and β is a very small positive parameter weighing the GTV regularization on R. Our main goal is to keep the consistency between I and R for the main structures.
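Before introducing the explicit noise term, the overall enhancement flow of this subsection can be summarized in a short Python sketch. It is illustrative only: decompose is a hypothetical stand-in for the estimation of Eqs. (10) and (11) (for instance, the iterative scheme given in Section IV-B), and the input is assumed to be an RGB float image in [0, 1].

```python
import numpy as np

def enhance_rgb(S_rgb, decompose, gamma=1.0 / 2.2):
    """Channel-wise enhancement flow of Sec. IV-A (illustrative sketch)."""
    # maxRGB initial illumination: I_hat(p) = max_c S_c(p), as in [8], [18]
    I_hat = S_rgb.max(axis=2)
    out = np.empty_like(S_rgb)
    for c in range(3):                      # repeat the framework per channel
        I, R = decompose(S_rgb[..., c], I_hat)
        out[..., c] = (I ** gamma) * R      # recompose: S' = I^gamma o R
    return np.clip(out, 0.0, 1.0)
```

Note that the maxRGB map Î is shared across the three channels, while the decomposition itself runs channel-wise, matching the scheme described above.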


Considering the influence of imaging noise, we hereby add a TV-denoising term ||∇x R||_1 + ||∇y R||_1 to the R-estimation function as below:

  argmin_R ||I ◦ R − S||²_2 + β( ||∇x R / exp((∇x Gσ2(R))² / (2σ1²))||_1 + ||∇y R / exp((∇y Gσ2(R))² / (2σ1²))||_1 ) + δ(||∇x R||_1 + ||∇y R||_1),   (12)

where δ is a positive parameter. With the help of δ, our method is able to handle different illumination conditions. For example, in real-world applications, we can simply set δ = 0 to handle the back light condition in the daytime (with less imaging noise), and set δ as a non-zero value to handle the weak light in the nighttime (with considerable imaging noise). Obviously, by setting δ = 0, Eq. (12) degenerates to Eq. (11), and the modified Retinex model degenerates into the traditional Retinex model.

Of note, the above I and R estimation is semi-decoupled. That means, the estimation of I does not involve R, while the estimation of R involves I. Furthermore, the denoising process is only implemented during the R-estimation.

B. Solution

Similar to the GTV filtering process, we can approximate the target function of Eq. (10) into a convex form:

  argmin_I ||I − Î||²_2 + α(ω_x ||∇x I||²_2 + ω_y ||∇y I||²_2).   (13)

Then, the R-estimation in Eq. (12) can be approximated as:

  argmin_R ||I ◦ R − S||²_2 + β(u_x ||∇x R||²_2 + u_y ||∇y R||²_2) + δ(q_x ||∇x R||²_2 + q_y ||∇y R||²_2).   (14)

The above approximations of the L1-norms in Eq. (10) and Eq. (12) are all based on:

  ||ζ||_1 = ||(ζ)² / ζ||_1 ≈ (1 / max(|ζ|, ϵ)) ||ζ||²_2,   (15)

where ζ can be ∇x,y I or ∇x,y R in our application, and ϵ is a small positive constant to avoid a zero denominator, which is empirically set as 0.001. The weights in Eq. (13) and Eq. (14) are:

  ω_x,y, u_x,y = 1 / (max(|∇x,y (I, R)|, ϵ) exp((∇x,y Gσ2(I, R))² / (2σ1²))),   (16)

  q_x,y = 1 / max(|∇x,y R|, ϵ).   (17)

Then, we reformulate Eq. (13) and Eq. (14) into the matrix forms:

  argmin_I (I − Î)^T (I − Î) + α(I^T Dx^T Wx Dx I + I^T Dy^T Wy Dy I),   (18)

  argmin_R (I ◦ R − S)^T (I ◦ R − S) + β(R^T Dx^T Ux Dx R + R^T Dy^T Uy Dy R) + δ(R^T Dx^T Qx Dx R + R^T Dy^T Qy Dy R),   (19)

where I, Î, R and S are the matrix forms of I, Î, R and S, Dx and Dy are the Toeplitz matrices from the discrete gradient operators with forward difference, and Wx,y, Ux,y and Qx,y are diagonal matrices containing the weights ω_x,y, u_x,y and q_x,y. The functions in Eq. (18) and Eq. (19) are convex, and can be analytically solved. In this way, we can design an iterative framework to gradually obtain the I and R layers. For the k-th iteration, I_k and R_k can be obtained as:

  I_k = (1 + αM_(k−1))^(−1) Î,   (20)

  R_k = (Ik^T Ik + βN_(k−1) + δZ_(k−1))^(−1) (Ik^T S),   (21)

where M_(k−1) = Dx^T Wx Dx + Dy^T Wy Dy, N_(k−1) = Dx^T Ux Dx + Dy^T Uy Dy, and Z_(k−1) = Dx^T Qx Dx + Dy^T Qy Dy, with the diagonal weight matrices computed from the (k−1)-th estimates. The iterative refinement of I and R stops when either of the two conditions is satisfied: ||I_k − I_(k−1)|| / ||I_(k−1)|| < ε or ||R_k − R_(k−1)|| / ||R_(k−1)|| < ε. Here ε is a small threshold controlling the iterations.

Speed-up strategy. For the initialization, I_0 can be set as Î, and R_0 can be set as S/I_1. To further speed up the iterative process, we conduct a pre-filtering on Î with our GTV filter F(·), and then use B = F(Î) to replace Î in Eq. (18) and Eq. (20). In this way, we have

  argmin_I (I − B)^T (I − B) + α(I^T Dx^T Wx Dx I + I^T Dy^T Wy Dy I),   (22)

  I_k = (1 + αM_(k−1))^(−1) B.   (23)

From Eq. (22), we can see that the original anchor Î is replaced by B, which resembles our requirement of a piecewise-smooth I. The quick convergence of this iterative process is validated in the experimental section. The whole low-light image enhancement process is summarized in Algorithm 1.

Algorithm 1 Low-Light Image Enhancement with Semi-Decoupled Decomposition
Input: input image S; parameters α, β, δ, ϵ, σ1 and σ2; maximum iterations K; stopping parameter ε.
Output: illumination I, reflectance R and enhanced image S′.
1: initialize Î(p) = max_{c∈{R,G,B}} S_c(p) and I_0 ← B = F(Î)
2: for k = 1 → K do
3:   compute weights ω_x,y using Eq. (16)
4:   update I_k using Eq. (23)
5:   if k = 1 then
6:     R_0 = S / I_1
7:   end if
8:   compute weights u_x,y using Eq. (16) and q_x,y using Eq. (17)
9:   update R_k using Eq. (21)
10:  if ||I_k − I_(k−1)|| / ||I_(k−1)|| < ε or ||R_k − R_(k−1)|| / ||R_(k−1)|| < ε or k > K then
11:    break
12:  end if
13: end for
14: S′ = I^(1/2.2) ◦ R


V. EXPERIMENTAL RESULTS AND ANALYSIS

In this section, we conduct qualitative and quantitative experiments to evaluate our low-light image enhancement method. Our method has two versions, i.e. the non-denoising version (δ = 0, denoted as OurI) and the denoising version (δ ≠ 0, denoted as OurII).

For the parameter settings, K, α, β, ε, ϵ, σ1 and σ2 are set as 20, 0.01, 0.0001, 0.01, 0.0001, 1 and 5 in both OurI and OurII. In addition, we generate B to accelerate the decomposition by pre-filtering Î. For Eq. (8), we set k = 1, λ = 0.05 ∗ 100, σ1 = 1, σ2 = 5 and ϵ = 0.001. We run all the codes on a PC with Windows 10 OS, 8G RAM and a 3.4GHz CPU. The experimental data come from four public datasets, i.e. DICM [36], LIME [8], Fusion [37] and VV (https://sites.google.com/site/vonikakis/datasets), processing 44+10+18+24=96 images in total.

We choose nine state-of-the-art methods for comparison, including JieP [4], HQEC [5], LIME [8], DeHz [7], MF [10], Retinex-Net [21], SICE [20], LIME+BM3D [8], [12], and RRM [6]. These methods are divided into two groups. Group 1 includes the first seven methods, which do not explicitly consider the imaging noise in their modeling process. Among Group 1, [4] is based on the traditional Retinex model, [5], [7], [8] are based on the simplified Retinex model, [10] is based on multiple fusion, and [20], [21] are deep-learning-based methods. We compare OurI with these seven methods. Group 2 includes the last two of the nine methods, which explicitly consider the imaging noise. [8] uses BM3D as a post-denoising step, and [6] directly models the imaging noise in their improved Retinex model. We compare OurII with these two methods in our experiments. The implementation codes for these nine methods were downloaded from their project sites, and the code of our method is available at https://github.com/hanxuhfut/Code.

A. Analysis of Our Decomposition

First, we validate the effectiveness of our decomposition. In Fig. 5, we show the results based on our decomposition model and other models [4], [6], [21], which also obtain the illumination layer and the reflectance layer for low-light enhancement. From this figure, we have the following observations. The illumination layer based on JieP [4] does not well separate the illumination of the distant road and buildings. As for RRM [6] and Retinex-Net [21], they both introduce strong artifacts into the decomposed reflectance layers, such as edge reversal and imaging noise. On the contrary, both versions of our method perform well in estimating the illumination layer and the reflectance layer.

Then, the fast convergence of our iterative decomposition is empirically validated. In Fig. 6, we compare the iterative errors ε_I = ||I_k − I_(k−1)|| / ||I_(k−1)|| (red) and ε_R = ||R_k − R_(k−1)|| / ||R_(k−1)|| (blue), the illumination (1st row), the reflectance (2nd row), and the final enhanced image (3rd row) under different iterations. From the iterative error curves shown in the 4th row of Fig. 6(a) and Fig. 6(b), we can see that the latter has a faster convergence while keeping the same visual quality of the enhancement results (the 3rd row of Fig. 6(a) and Fig. 6(b)). Compared with JieP [4] (the 4th row of Fig. 6(c)), our method without the speed-up strategy has a slightly faster convergence speed (the 4th row of Fig. 6(a)). As for our method with the speed-up strategy, it shows a much faster convergence speed (the 4th row of Fig. 6(b)), due to the similarity between the fidelity anchor B in Eq. (23) and our desired illumination I. By comparing Fig. 6(a) and Fig. 6(b) with Fig. 6(c), it is demonstrated that our method performs comparably or better in terms of the visual quality of the decomposed layers.

B. Comparison with Other Methods

1) Qualitative Comparison: Fig. 7, Fig. 8, and Fig. 9 demonstrate the visual comparisons between OurI and the methods in Group 1. We have the following observations. First, the results based on DeHz [7] and Retinex-Net [21] have strong artifacts such as unrealistic edges and strongly boosted noise. Second, LIME [8] and SICE [20] tend to over-enhance the images, which also strongly boosts the imaging noise [8] or loses the details of the originally-bright regions [20]. Third, among all the eight methods for comparison, the results based on JieP [4], HQEC [5], MF [10] and OurI all achieve acceptable visual quality. Taking a closer look at Fig. 7, OurI better restores the palette's color from the relatively dark scene. In addition, as shown in Fig. 8, OurI suppresses the imaging noise better than HQEC [5], which empirically validates the usefulness of the full Retinex model. Compared with MF [10], Fig. 9 shows that our method better keeps the naturalness of the sky region, while effectively enhancing the building region. Of note, the performance of SICE [20] and Retinex-Net [21] is not satisfying. On the one hand, they are both data-driven models that can suffer from insufficient pairwise training data. As for the low-light enhancement task, the dataset collected in either [20] or [21] contains less than one thousand image pairs, which is far from covering the various low-light conditions of the numerous real-world scenes. On the other hand, the loss functions also need a careful design to adapt to the specific needs of the low-light enhancement task.

In Fig. 10, we compare OurII with the methods in Group 2 to validate our noise suppression performance. LIME+BM3D [8], [12] simply considers the noise suppression as a post-denoising step. Therefore, its results in Fig. 10(b) are inevitably over-enhanced, and introduce strong noise. As for the results of RRM [6] (Fig. 10(c)), the naturalness of the overall scene illumination improves, and the imaging noise is effectively suppressed at the same time. However, strong edge artifacts can be introduced when processing dark regions, which are inherited from the estimated R layer (e.g. Fig. 5(c)).


Fig. 5. Comparison of several Retinex decomposition results. (a) Input. (b) JieP [4]. (c) RRM [6]. (d) Retinex-Net [21]. (e) OurI. (f) OurII. The illumination maps and the reflectance maps are shown in the first row and the second row, respectively.

Fig. 6. An example of model convergence results on the original image shown in Fig. 5(a). (a-c) From top to bottom: illumination maps, reflectance maps, final enhanced images, and iterative error curves (ε_I is the red curve, ε_R is the blue curve), obtained based on our method without the speed-up strategy, with the speed-up strategy, and the method of JieP [4], respectively. In each of (a-c), the three columns of the first three rows correspond to k = 2, 5 and 10.

In Fig. 11, we demonstrate that OurII can better preserve the image contents in originally bright regions. The enhanced results in each row of Fig. 11(b) and Fig. 11(c) are based on LIME+BM3D [8], [12], RRM [6], and OurII, respectively. Fig. 11(d) and Fig. 11(e) show the residuals (basically noise and small textures) and their 1-D profiles. We observe that OurII achieves the lowest residuals in the lamp region. In our application, the originally bright regions with very large SNRs require no denoising. Therefore, OurII better satisfies this demand by keeping these regions unchanged as much as possible.

2) Quantitative Comparison: We compare our method with others by using non-reference image quality assessment metrics and subjective rating. The evaluation metrics include the Natural Image Quality Evaluator (NIQE) [38], the AutoRegressive-based Image Sharpness Metric (ARISM) [39] and the No-reference Free Energy based Robust Metric (NFERM) [40]. Among them, we adopt two versions of ARISM: one evaluates the luminance only, and the other evaluates both luminance and chromatic components. For all four metrics, smaller values indicate better performance.

We show the quantitative scores for the methods of Group 1 and OurI from Table I to Table IV.


Fig. 7. Comparison with state-of-the-art low-light image enhancement methods. (a) Input. (b) MF [10]. (c) DeHz [7]. (d) LIME [8]. (e) HQEC [5]. (f) Retinex-Net [21]. (g) SICE [20]. (h) JieP [4]. (i) OurI.

Fig. 8. Comparison with state-of-the-art low-light image enhancement methods. (a) Input. (b) MF [10]. (c) DeHz [7]. (d) LIME [8]. (e) HQEC [5]. (f) Retinex-Net [21]. (g) SICE [20]. (h) JieP [4]. (i) OurI.

Fig. 9. Comparison with state-of-the-art low-light image enhancement methods. (a) Input. (b) MF [10]. (c) DeHz [7]. (d) LIME [8]. (e) HQEC [5]. (f) Retinex-Net [21]. (g) SICE [20]. (h) JieP [4]. (i) OurI.


Fig. 10. Comparison with state-of-the-art low-light image enhancement and denoising methods. (a) Input. (b) LIME+BM3D [8], [12]. (c) RRM [6]. (d) OurII.

The observations and their analysis are as follows. Generally, for all four datasets, OurI, JieP [4] and MF [10] gain better performances than the other five methods. Among the top three methods, OurI and JieP are comparable to each other. In Table I, JieP achieves the best performance in terms of NIQE, while OurI obtains the best performances on the other three metrics. Of note, MF only fuses several intermediate results based on some simple enhancing techniques, which demonstrates the usefulness of the fusion roadmap. The success of OurI and JieP demonstrates that the traditional Retinex model is very competitive in the low-light image enhancement task, even compared with the deep-learning-based methods. We also show the quantitative scores for the methods of Group 2 and OurII in Table V and Table VI. Similarly, OurII also obtains the best performances on most datasets under three metrics, which demonstrates the effectiveness of our modeling for noise suppression.

Additionally, we conduct a subjective rating to further evaluate our method. 20 volunteers aged from 20 to 35 were asked to rate each of the processed images of our method and the competitors with an overall score. A five-grade rating rule was adopted, in which 1 means the worst and 5 means the best. The evaluation criteria include 1) whether the lightness condition has been improved, and 2) whether the result image looks natural. Two rounds of experiments were conducted for each volunteer: one for Group 1 vs. OurI, and the other for Group 2 vs. OurII. The two score charts are shown in Fig. 12 and Fig. 13, respectively. It is worth noting that the rating software we developed randomly arranges the order of the enhanced images for each new-coming test image. Based on Fig. 12, we can see that OurI achieves the best scores on all the tests. Specifically, OurI, JieP and MF still occupy the top 3 performances, and OurI is on par with or better than JieP. As shown in Fig. 13, OurII obtains much better performances than the other two competitors. Generally, the results of these subjective ratings are consistent with the results based on the above non-reference evaluation metrics.

C. Other Application

In addition to low-light image enhancement, we also applied our model to the task of image dehazing, which also achieves good results. Inspired by [7] and [8], we can consider the inverted haze image 1 − H as a low-light image S. Therefore, the dehazed image can be obtained by sending 1 − H into our model, and inverting the output image again. Fig. 14 shows the process.

Fig. 15 presents some comparisons of haze removal results based on our model and the dehazing method based on the dark channel prior [41].
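In code, this inversion trick reduces to two lines; the sketch below assumes enhance is the full low-light pipeline (e.g. the channel-wise flow sketched in Section IV-A) operating on images normalized to [0, 1].

```python
def dehaze(H, enhance):
    """Haze removal via the inversion trick of Sec. V-C (sketch)."""
    S = 1.0 - H              # the inverted haze image resembles a low-light image
    return 1.0 - enhance(S)  # enhance it, then invert the output back
```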


Fig. 11. Comparison of different denoising methods. (a) Input. (b) Noise-containing image. (c) Denoised image. (d) Residuals between (b) and (c). (e) 1-D profile signal (amplitude vs. pixel position). The top row shows the results of LIME [8], the middle row shows the results of RRM [6], and the bottom row shows the results of OurII.

TABLE I
COMPARISON OF AVERAGE NIQE [38] ON FOUR DATASETS

Dataset   MF [10]   DeHz [7]   LIME [8]   HQEC [5]   Retinex-Net [21]   SICE [20]   JieP [4]   OurI
DICM      3.4533    4.3412     3.7023     3.6758     4.7121             4.0834      2.9067     2.9341
LIME      4.1025    4.2465     4.2729     4.2784     4.9079             4.2229      3.4414     3.6581
VV        2.4371    2.9045     2.8918     2.5375     3.2440             3.0799      2.3669     2.5138
Fusion    2.8842    3.5248     3.3068     2.9474     3.7337             3.8690      2.8850     3.0372
Average   3.1601    3.8191     3.4849     3.3174     4.1867             3.8068      2.8234     2.9238

TABLE II
COMPARISON OF AVERAGE ARISM [39] (LUMINANCE COMPONENT ONLY) ON FOUR DATASETS

Dataset   MF [10]   DeHz [7]   LIME [8]   HQEC [5]   Retinex-Net [21]   SICE [20]   JieP [4]   OurI
DICM      3.0860    3.2078     3.2230     3.3874     3.0383             3.3769      2.9605     2.9321
LIME      2.8673    2.9451     3.0080     3.0071     3.1879             3.2720      2.8034     2.7648
VV        2.8228    2.8887     3.1383     2.9830     2.9828             3.3864      2.7890     2.8209
Fusion    2.7600    2.7957     2.9458     2.8474     2.8258             3.1125      2.7512     2.7636
Average   2.9363    3.0234     3.1274     3.1454     3.0020             3.3188      2.8620     2.8583

TABLE III
COMPARISON OF AVERAGE ARISM [39] (LUMINANCE AND CHROMATIC COMPONENTS) ON FOUR DATASETS

Dataset   MF [10]   DeHz [7]   LIME [8]   HQEC [5]   Retinex-Net [21]   SICE [20]   JieP [4]   OurI
DICM      3.3164    3.4479     3.4868     3.6726     3.3883             3.6992      3.2095     3.1831
LIME      3.1213    3.1971     3.2859     3.2580     3.4579             3.5552      3.0494     3.0107
VV        3.0566    3.1236     3.4110     3.2298     3.2440             3.7157      3.0309     3.0787
Fusion    3.0255    3.0680     3.2423     3.1264     3.1395             3.4747      3.0152     3.0345
Average   3.1766    3.2695     3.4011     3.4163     3.3147             3.6462      3.1118     3.1112


TABLE IV
COMPARISON OF AVERAGE NFERM [40] ON FOUR DATASETS

Dataset   MF [10]   DeHz [7]   LIME [8]   HQEC [5]   Retinex-Net [21]   SICE [20]   JieP [4]   OurI
DICM      15.5248   22.6046    17.8705    18.4524    19.3010            17.3339     4.5012     3.7930
LIME      13.3566   12.5028    15.6323    13.7183    22.0788            11.0754     18.4181    7.8934
VV        13.7342   16.9443    16.8710    13.4168    23.2519            14.5109     11.3966    9.1816
Fusion    18.1297   6.8282     5.0819     4.7154     6.0992             2.5557      17.0225    17.1119
Average   15.3397   17.1792    14.9896    14.1247    18.2291            13.2053     10.0225    8.0600

TABLE V
COMPARISON OF AVERAGE NIQE [38] AND NFERM [40] ON FOUR DATASETS

          NIQE                                         NFERM
Dataset   LIME+BM3D [8], [12]   RRM [6]   OurII        LIME+BM3D [8], [12]   RRM [6]    OurII
DICM      2.8592                3.3186    3.0161       14.1868               11.5682    6.8160
LIME      4.1842                4.1080    3.7459       31.2363               22.5246    18.2021
VV        2.4030                2.7928    2.7328       17.8949               12.9621    8.6622
Fusion    3.2426                3.5877    3.1808       27.8328               19.9871    19.0156
Average   2.9551                3.3056    3.0522       19.4484               14.4366    10.7510

TABLE VI
COMPARISON OF AVERAGE ARISM [39] (BOTH VERSIONS) ON FOUR DATASETS

          ARISM (luminance only)                       ARISM (luminance and chromatic)
Dataset   LIME+BM3D [8], [12]   RRM [6]   OurII        LIME+BM3D [8], [12]   RRM [6]    OurII
DICM      2.8157                2.6968    2.6762       3.1376                2.9452     2.9197
LIME      2.6421                2.6528    2.6571       2.9471                2.8953     2.8923
VV        2.8613                2.7943    2.7801       3.1576                3.0616     3.0297
Fusion    2.7291                2.7534    2.6887       3.0426                3.0256     2.9466
Average   2.7928                2.7274    2.7025       3.1049                2.9842     2.9494

Fig. 12. Average user study scores on four datasets (Group 1 vs. OurI; bars over DICM, LIME, Fusion, VV and Average).

Fig. 13. Average user study scores on four datasets (Group 2 vs. OurII; bars over DICM, LIME, Fusion, VV and Average).

According to [41], we set the aerial perspective parameter as 0.95 and the patch size as 15 × 15. As shown in Fig. 15, the results of [41] have strong artifacts, while our method removes most of the haze and keeps a natural visual appearance.

VI. CONCLUSION AND DISCUSSIONS

In this paper, we propose a low-light image enhancement method based on the semi-decoupled Retinex decomposition. During the decomposition process, the illumination layer is individually estimated based on the proposed Gaussian Total Variation filter, while the reflectance layer is jointly estimated based on the Retinex constraint. Our low-light enhancement model can be easily adjusted to tackle low-light images with different imaging noise levels. Qualitative and quantitative experiments on four public datasets validate the effectiveness of our method.

There also exist some limitations of our method. First, we only consider the imaging noise in the proposed method. For storing and transferring digital images, the compression algorithm imposed on an image can produce JPEG artifacts hidden in the originally dark regions. Removing JPEG artifacts while keeping the naturalness of the image is challenging, especially when the spatial scale of the JPEG blocks is as large as that of the image contents. The divide-and-conquer strategy applied in [42], [43] is a possible solution to handle this type of artifact in the low-light enhancement task. Second, our method is not formulated as an aesthetics-driven enhancement model. It is known that many beautiful images contain dark regions for visual impact or emotion expression. Our current method is limited in generating enhanced images at the semantic-aware and aesthetic-aware level. Instead, it tries to enhance all the dark regions with no preference. Aiming at personalizing the enhancing strengths for different image regions, one possible solution is to fully leverage social information and embed it into the enhancement model [44].
different imaging noise levels. Qualitative and quantitative for visual impact or emotion expression. Our current method


Fig. 14. Haze removal example by our model. (a) Haze image H. (b) Inverted haze image 1 − H. (c) Enhanced result by our model. (d) Haze removal result (inverted image of (c)).

Fig. 15. Comparison of haze removal results. (a) Haze images. (b) Results by [41]. (c) Results by our model.
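To make the extension in Fig. 14 concrete, the following is a minimal sketch of the inversion trick, assuming a generic low-light enhancer is available as a callable that maps [0, 1] images to [0, 1] images; the function name dehaze_by_inversion is hypothetical, and the gamma curve in the usage line merely stands in for the proposed enhancement model:

```python
import numpy as np

def dehaze_by_inversion(hazy, enhance_low_light):
    """Haze removal via inversion: a hazy image resembles a low-light
    image once inverted, so we enhance 1 - H and invert the result
    back (panels (b)-(d) of Fig. 14). `enhance_low_light` is a
    placeholder for any low-light enhancer."""
    H = np.clip(hazy.astype(np.float64) / 255.0, 0.0, 1.0)
    inverted = 1.0 - H                       # (b) inverted haze image 1 - H
    enhanced = enhance_low_light(inverted)   # (c) low-light enhancement
    dehazed = 1.0 - np.clip(enhanced, 0.0, 1.0)  # (d) invert back
    return (dehazed * 255.0 + 0.5).astype(np.uint8)

if __name__ == "__main__":
    # Toy usage: a naive gamma curve stands in for the actual enhancer.
    rng = np.random.default_rng(0)
    hazy = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
    result = dehaze_by_inversion(hazy, lambda img: img ** 0.4)
```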
REFERENCES

[1] X. Fu, Y. Liao, D. Zeng, Y. Huang, X.-P. Zhang, and X. Ding, “A probabilistic method for image enhancement with simultaneous illumination and reflectance estimation,” IEEE Trans. Image Process., vol. 24, no. 12, pp. 4965–4977, Dec. 2015.
[2] X. Fu, D. Zeng, Y. Huang, X.-P. Zhang, and X. Ding, “A weighted variational model for simultaneous reflectance and illumination estimation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 2782–2790.
[3] H. Yue, J. Yang, X. Sun, F. Wu, and C. Hou, “Contrast enhancement based on intrinsic image decomposition,” IEEE Trans. Image Process., vol. 26, no. 8, pp. 3981–3994, Aug. 2017.
[4] B. Cai, X. Xu, K. Guo, K. Jia, B. Hu, and D. Tao, “A joint intrinsic-extrinsic prior model for retinex,” in Proc. IEEE Int. Conf. Comput. Vis., 2017, pp. 4020–4029.
[5] Q. Zhang, G. Yuan, C. Xiao, L. Zhu, and W.-S. Zheng, “High-quality exposure correction of underexposed photos,” in Proc. 26th ACM Int. Conf. Multimedia, 2018, pp. 582–590.
[6] M. Li, J. Liu, W. Yang, X. Sun, and Z. Guo, “Structure-revealing low-light image enhancement via robust retinex model,” IEEE Trans. Image Process., vol. 27, no. 6, pp. 2828–2841, Jun. 2018.
[7] X. Dong, G. Wang, Y. Pang, W. Li, J. Wen, W. Meng, and Y. Lu, “Fast efficient algorithm for enhancement of low lighting video,” in Proc. IEEE Int. Conf. Multimedia Expo, 2011, pp. 1–6.
[8] X. Guo, Y. Li, and H. Ling, “LIME: Low-light image enhancement via illumination map estimation,” IEEE Trans. Image Process., vol. 26, no. 2, pp. 982–993, Feb. 2017.
[9] C. Lee, C. Lee, and C.-S. Kim, “Contrast enhancement based on layered difference representation of 2D histograms,” IEEE Trans. Image Process., vol. 22, no. 12, pp. 5372–5384, Dec. 2013.
[10] X. Fu, D. Zeng, Y. Huang, Y. Liao, X. Ding, and J. Paisley, “A fusion-based enhancing method for weakly illuminated images,” Signal Process., vol. 129, pp. 82–96, Dec. 2016.
[11] Y. Gao, H.-M. Hu, B. Li, and Q. Guo, “Naturalness preserved nonuniform illumination estimation for image enhancement based on retinex,” IEEE Trans. Multimedia, vol. 20, no. 2, pp. 335–344, Feb. 2018.
[12] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3-D transform-domain collaborative filtering,” IEEE Trans. Image Process., vol. 16, no. 8, pp. 2080–2095, Aug. 2007.
[13] X. Ren, M. Li, W.-H. Cheng, and J. Liu, “Joint enhancement and denoising method via sequential decomposition,” in Proc. IEEE Int. Symp. Circuits Syst., 2018, pp. 1–5.
[14] W. Zhao, H. Lu, and D. Wang, “Multisensor image fusion and enhancement in spectral total variation domain,” IEEE Trans. Multimedia, vol. 20, no. 4, pp. 866–879, Apr. 2018.
[15] F. Kou, Z. Wei, W. Chen, X. Wu, C. Wen, and Z. Li, “Intelligent detail enhancement for exposure fusion,” IEEE Trans. Multimedia, vol. 20, no. 2, pp. 484–495, Feb. 2018.
[16] K. Ma, H. Li, H. Yong, Z. Wang, D. Meng, and L. Zhang, “Robust multi-exposure image fusion: a structural patch decomposition approach,” IEEE Trans. Image Process., vol. 26, no. 5, pp. 2519–2532, May 2017.
[17] Y. Kinoshita and H. Kiya, “Scene segmentation-based luminance adjustment for multi-exposure image fusion,” IEEE Trans. Image Process., vol. 28, no. 8, pp. 4101–4116, Aug. 2019.
[18] S. Hao, Y. Guo, and Z. Wei, “Lightness-aware contrast enhancement for images with different illumination conditions,” Multimedia Tools Appl., vol. 78, no. 3, pp. 3817–3830, Feb. 2019.
[19] K. G. Lore, A. Akintayo, and S. Sarkar, “LLNet: A deep autoencoder approach to natural low-light image enhancement,” Pattern Recognit., vol. 61, pp. 650–662, Jan. 2017.
[20] J. Cai, S. Gu, and L. Zhang, “Learning a deep single image contrast enhancer from multi-exposure images,” IEEE Trans. Image Process., vol. 27, no. 4, pp. 2049–2062, Apr. 2018.
[21] C. Wei, W. Wang, W. Yang, and J. Liu, “Deep retinex decomposition for low-light enhancement,” in Proc. Brit. Mach. Vis. Conf., 2018, pp. 1–12.
[22] R. Wang, Q. Zhang, C.-W. Fu, X. Shen, W.-S. Zheng, and J. Jia, “Underexposed photo enhancement using deep illumination estimation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2019, pp. 6842–6850.
[23] W. Ren, S. Liu, L. Ma, Q. Xu, X. Xu, X. Cao, J. Du, and M.-H. Yang, “Low-light image enhancement via a deep hybrid network,” IEEE Trans. Image Process., vol. 28, no. 9, pp. 4364–4375, Sep. 2019.
[24] Y. Jiang, X. Gong, D. Liu, Y. Cheng, C. Fang, X. Shen, J. Yang, P. Zhou, and Z. Wang, “EnlightenGAN: Deep light enhancement without paired supervision,” 2019, arXiv:1906.06972.
[25] H. Zhang, X. Shang, H. Luan, M. Wang, and T.-S. Chua, “Learning from collective intelligence: Feature learning using social images and tags,” ACM Trans. Multimedia Comput. Commun. Appl., vol. 13, no. 1, pp. 1:1–1:23, Jan. 2017.
[26] H. Yang, B. Wang, N. Vesdapunt, M. Guo, and S. B. Kang, “Personalized exposure control using adaptive metering and reinforcement learning,” IEEE Trans. Vis. Comput. Graph., vol. 25, no. 10, pp. 2953–2968, Oct. 2019.
[27] L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Phys. D, Nonlinear Phenomena, vol. 60, no. 1-4, pp. 259–268, Nov. 1992.
[28] K. He, J. Sun, and X. Tang, “Guided image filtering,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 6, pp. 1397–1409, Jun. 2013.
[29] C. Tomasi and R. Manduchi, “Bilateral filtering for gray and color images,” in Proc. IEEE Int. Conf. Comput. Vis., 1998, pp. 839–846.
[30] Z. Farbman, R. Fattal, D. Lischinski, and R. Szeliski, “Edge-preserving decompositions for multi-scale tone and detail manipulation,” ACM Trans. Graph., vol. 27, no. 3, pp. 67:1–67:10, Aug. 2008.
[31] L. Xu, Q. Yan, Y. Xia, and J. Jia, “Structure extraction from texture via relative total variation,” ACM Trans. Graph., vol. 31, no. 6, pp. 139:1–139:10, Nov. 2012.
[32] B. Cai, X. Xing, and X. Xu, “Edge/structure preserving smoothing via relativity-of-Gaussian,” in Proc. IEEE Int. Conf. Image Process., 2017, pp. 250–254.


[33] X. Guo, Y. Li, J. Ma, and H. Ling, “Mutually guided image filtering,” IEEE Trans. Pattern Anal. Mach. Intell., to be published. doi: 10.1109/TPAMI.2018.2883553.
[34] X. Guo, S. Li, L. Li, and J. Zhang, “Structure-texture decomposition via joint structure discovery and texture smoothing,” in Proc. IEEE Int. Conf. Multimedia Expo, 2018, pp. 1–6.
[35] R. Barrett, M. W. Berry, T. F. Chan, J. Demmel, J. Donato, J. Dongarra, V. Eijkhout, R. Pozo, C. Romine, and H. Van der Vorst, Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. Philadelphia, PA, USA: SIAM, 1994.
[36] C. Lee, C. Lee, and C.-S. Kim, “Contrast enhancement based on layered difference representation,” in Proc. IEEE Int. Conf. Image Process., 2012, pp. 965–968.
[37] Q. Wang, X. Fu, X.-P. Zhang, and X. Ding, “A fusion-based method for single backlit image enhancement,” in Proc. IEEE Int. Conf. Image Process., 2016, pp. 4077–4081.
[38] A. Mittal, R. Soundararajan, and A. C. Bovik, “Making a ‘completely blind’ image quality analyzer,” IEEE Signal Process. Lett., vol. 20, no. 3, pp. 209–212, Mar. 2013.
[39] K. Gu, G. Zhai, W. Lin, X. Yang, and W. Zhang, “No-reference image sharpness assessment in autoregressive parameter space,” IEEE Trans. Image Process., vol. 24, no. 10, pp. 3218–3231, Oct. 2015.
[40] K. Gu, G. Zhai, X. Yang, and W. Zhang, “Using free energy principle for blind image quality assessment,” IEEE Trans. Multimedia, vol. 17, no. 1, pp. 50–63, Jan. 2015.
[41] K. He, J. Sun, and X. Tang, “Single image haze removal using dark channel prior,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 12, pp. 2341–2353, Dec. 2011.
[42] Y. Li, F. Guo, R. T. Tan, and M. S. Brown, “A contrast enhancement framework with JPEG artifacts suppression,” in Proc. Eur. Conf. Comput. Vis., 2014, pp. 174–188.
[43] C. Xu, S. Hao, Y. Guo, and R. Hong, “Enhancing low-light images with JPEG artifact based on image decomposition,” in Proc. Pacific Rim Conf. Multimedia, 2018, pp. 3–12.
[44] H. Zhang, Z.-J. Zha, Y. Yang, S. Yan, and T.-S. Chua, “Robust (semi) nonnegative graph embedding,” IEEE Trans. Image Process., vol. 23, no. 7, pp. 2996–3012, Jul. 2014.

Shijie Hao is an associate professor at the School of Computer Science and Information Engineering, Hefei University of Technology (HFUT). He is also with the Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Ministry of Education. He received his Ph.D. degree at HFUT in 2012. His research interests include image processing and multimedia content analysis.

Xu Han is pursuing his Master's degree at the School of Computer and Information, Hefei University of Technology. He is also with the Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Ministry of Education. His research interests include digital image processing and analysis.

Yanrong Guo is an associate professor at the School of Computer and Information, Hefei University of Technology (HFUT). She is also with the Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Ministry of Education. She received her Ph.D. degree at HFUT in 2013. She was a postdoctoral researcher at the University of North Carolina at Chapel Hill (UNC) from 2013 to 2016. Her research interests include image processing and pattern recognition.

Xin Xu received the B.Sc. and Ph.D. degrees in computer science and engineering from Shanghai Jiao Tong University, China, in 2004 and 2012, respectively. He is an associate professor in the School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China. His current research interests include computer vision, deep learning, and visual surveillance.

Meng Wang received the B.E. and Ph.D. degrees in the Special Class for the Gifted Young and the Department of Electronic Engineering and Information Science from the University of Science and Technology of China (USTC), Hefei, China, in 2003 and 2008, respectively. He is a professor at the Hefei University of Technology, China. He is also with the Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Ministry of Education. His current research interests include multimedia content analysis, computer vision, and pattern recognition. He has authored more than 200 book chapters, journal and conference papers in these areas. He is the recipient of the ACM SIGMM Rising Star Award 2014. He is an associate editor of the IEEE Transactions on Knowledge and Data Engineering (TKDE), the IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), and the IEEE Transactions on Neural Networks and Learning Systems (TNNLS).
