arXiv:1912.11463v1 [cs.CV] 24 Dec 2019

Abstract—High dynamic range (HDR) image generation from a single exposure low dynamic range (LDR) image has been made possible due to the recent advances in deep learning. Various feed-forward convolutional neural networks (CNNs) have been proposed for learning LDR to HDR representations. To better utilize the power of CNNs, we exploit the idea of feedback, where the initial low-level features are guided by the high-level features using the hidden state of a recurrent neural network. Unlike a single forward pass in a conventional feed-forward network, the reconstruction from LDR to HDR in a feedback network is learned over multiple iterations. This enables us to create a coarse-to-fine representation, leading to an improved reconstruction at every iteration. Advantages over standard feed-forward networks include early reconstruction ability and better reconstruction quality with fewer network parameters. We design a dense feedback block and propose an end-to-end feedback network, FHDR, for HDR image generation from a single exposure LDR image. Qualitative and quantitative evaluations show the superiority of our approach over the state-of-the-art methods.

Index Terms—HDR imaging, Feedback Networks, RNN, Deep Learning.

I. INTRODUCTION

Common digital cameras cannot capture the wide range of light intensity levels present in a natural scene. This can lead to a loss of pixel information in under-exposed and over-exposed regions of an image, resulting in a low dynamic range (LDR) image. To recover the lost information and represent the wide range of illuminance in an image, high dynamic range (HDR) images need to be generated. There has been active research in the area of deep learning for HDR imaging. The advances in deep learning for image processing tasks have paved the way for various approaches to HDR image reconstruction using feed-forward convolutional neural network (CNN) architectures [1] [2] [3] [4] [5]. The above methods specifically transform a single exposure LDR image into an HDR image. HDRCNN [1] proposed a deep autoencoder for HDR image reconstruction which uses a weighted mask to recover only the over-exposed regions of an LDR image. The authors of DRTMO [3] designed a framework with two networks for generating up-exposure and down-exposure LDR images, which are merged to form an HDR image. The network is not end-to-end trainable and uses a large number of parameters. Unlike the others, the proposed FHDR model provides an end-to-end trainable solution and is able to comprehensively learn the LDR-HDR mapping while outperforming the existing methods.

Deeper networks are known to learn more complex non-linear relationships, such as the LDR to HDR mapping. The caveat with deeper networks is that they consume a lot of computational resources and tend to over-fit the training data. To overcome this problem, we exploit the power of feedback mechanisms, inspired by [6], for the task of HDR image reconstruction. A feedback block is an RNN whose output is fed back to guide its input via a hidden state. A feedback network can run for many iterations for a single training example. Considering the number of iterative operations on the shared network parameters, a feedback network is virtually deeper than the corresponding feed-forward network with the same physical depth. Here, virtual depth = physical depth × number of iterations.

For improving the reconstruction at every iteration, the loss is calculated for the output of each iteration. By doing so, the network is forced to create a coarse-to-fine representation, being able to reconstruct HDR content right from the first iteration and improve with every subsequent iteration. We propose a global feedback block which consists of smaller local feedback blocks of densely connected layers, inspired by the DenseNet architecture [7]. Dense connections allow the network to reuse features, which helps in learning robust representations even with fewer network parameters.

The performance of our framework is evaluated on the standard City Scene dataset [8] and another dataset prepared from the list of HDR image datasets suggested in [1]. Qualitative and quantitative assessments of the network suggest that even with fewer network parameters, the proposed FHDR model outperforms the state-of-the-art methods.

II. HDR RECONSTRUCTION FRAMEWORK

A. Feedback system

Feedback systems are adopted to influence the input based on the generated output, unlike conventional feed-forward networks, where the information flow is unidirectional and is not directly influenced by the generated output. The authors in [6] exploited the feedback mechanism using an RNN, where the output of the feedback block travels back to guide its input via a hidden state. Their architecture uses ConvLSTM cells as the basic RNN units and is designed for the task of image classification. Even with fewer network parameters compared to other feed-forward networks, such networks are able to learn better representations. Recently, the authors of [9] designed a feedback network specifically for the task of image super-resolution, which achieved state-of-the-art performance. Inspired by the success of feedback networks, we designed a feedback network for learning the LDR to HDR mapping, which is explained in detail in the following sections.

B. Model architecture

Here, F^t_res represents the global residual feature map learned at iteration t, and F^1_in stands for the low-level LDR features from the first convolutional layer in the FEB. F^t_res is passed to the HRB to generate an HDR image at every iteration as below:

    H^t_gen = f_HRB(F^t_res)    (4)

Here, f_HRB represents the operations of the HRB and H^t_gen represents the HDR image generated at the t-th iteration. For every LDR image, a forward pass can run for n iterations, therefore generating n HDR images. Each generated image is coupled with a loss, hence resulting in improved reconstruction at each iteration through back-propagation through time.
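To make the iterative scheme concrete, the unrolled forward pass with a per-iteration loss can be sketched in plain Python. Note that `feedback_step`, `reconstruct`, and the toy update rules inside them are illustrative stand-ins, not the actual FHDR layers:

```python
# Illustrative sketch of a feedback network's unrolled forward pass.
# feedback_step and reconstruct are toy stand-ins for the feedback
# block and the HDR reconstruction block (HRB).

def feedback_step(features, hidden):
    # The hidden state carries high-level information from the previous
    # iteration back to guide the low-level input features (toy update).
    return [f + 0.5 * h for f, h in zip(features, hidden)]

def reconstruct(hidden):
    # Toy stand-in for f_HRB: maps the residual features to an HDR estimate.
    return [2.0 * h for h in hidden]

def l1_loss(pred, target):
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def unrolled_forward(features, target, n_iterations=4):
    hidden = features            # hidden state initialised with low-level features
    losses = []
    for _ in range(n_iterations):
        hidden = feedback_step(features, hidden)  # output fed back via hidden state
        hdr = reconstruct(hidden)                 # H^t_gen = f_HRB(F^t_res)
        losses.append(l1_loss(hdr, target))       # a loss at every iteration
    return losses

losses = unrolled_forward(features=[0.2, 0.4], target=[0.8, 1.6])
assert all(a > b for a, b in zip(losses, losses[1:]))  # coarse-to-fine refinement
```

Because every iteration reuses the same parameters, the n-iteration unrolling is "virtually" n times deeper than a single pass while keeping the physical parameter count fixed.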
C. Feedback block
Fig. 4: Qualitative evaluation against five described methods on the curated HDR dataset. (HDR images have been tonemapped using the Reinhard tonemapping algorithm [17] for display.)
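For reference, the HDR results are displayed after tonemapping with the Reinhard operator [17]. Its simple global form can be sketched as follows; this is a minimal illustration on a list of luminance values that omits the white-point ("burn-out") extension of the full operator:

```python
import math

# Minimal sketch of the global Reinhard tone-mapping operator [17],
# applied to a list of luminance values. Real implementations work on
# full images and usually include the white-point extension.

def reinhard_tonemap(luminance, key=0.18, eps=1e-6):
    # Log-average luminance of the scene (the "key").
    log_avg = math.exp(sum(math.log(eps + l) for l in luminance) / len(luminance))
    # Scale the scene so its log-average maps to the chosen key value.
    scaled = [key * l / log_avg for l in luminance]
    # Simple operator L/(1+L): compresses high luminances into [0, 1).
    return [s / (1.0 + s) for s in scaled]

hdr = [0.01, 0.5, 10.0, 500.0]
ldr = reinhard_tonemap(hdr)
assert all(0.0 <= v < 1.0 for v in ldr)
assert ldr == sorted(ldr)   # monotone: luminance ordering is preserved
```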
B. Qualitative evaluation

As can be seen in Fig. 4, non-learning based methods are unable to recover highly over-exposed regions. DRTMO brightens the whole image and is able to recover only the under-exposed regions. HDRCNN, on the other hand, is designed to recover only the saturated regions and thus underperforms in recovering information in the under-exposed dark regions of the image. Unlike the others, our FHDR/W pays equal attention to both under-exposed and over-exposed regions of the image. The proposed FHDR network with the feedback connections enhances the overall sharpness of the image and achieves an even better reconstruction of the input.

Fig. 5: Convergence analysis for four variants of FHDR, evaluated on the City Scene dataset.
HDRCNN [1] and DRTMO [3] and the feed-forward counterpart of the proposed FHDR method. We use the HDR toolbox [20] for evaluating the non-learning based methods. The deep learning methods were trained and evaluated on the described datasets, as mentioned in earlier sections.

DRHT [4] is the state-of-the-art deep neural network for the image correction task. It reconstructs HDR content as an intermediate output and transforms it back into an LDR image. Due to the unavailability of the DRHT network implementation, results of their network on the curated dataset are not presented. Performance metrics of their LDR-to-HDR network, trained on the City Scene dataset, have been sourced from their paper because of the identical experimental setup in terms of the training and testing datasets.

                   City Scene Dataset         Curated HDR Dataset
Methods            PSNR    SSIM   Q-score     PSNR    SSIM   Q-score
AKY [15]           15.35   0.44   35.40       9.58    0.20   33.47
KOV [16]           16.77   0.59   35.31       12.99   0.41   29.87
HDRCNN [1]         13.21   0.38   54.34       12.13   0.34   55.32
DRTMO [3]          -       -      -           11.4    0.28   58.85
DRHT [4]           -       0.93   61.51       -       -      -
FHDR/W             25.39   0.89   63.21       16.94   0.74   65.27
FHDR               32.54   0.95   67.18       20.3    0.79   70.97

TABLE I: Quantitative comparison against state-of-the-art methods.

A. Quantitative evaluation

A quantitative comparison of the above mentioned HDR reconstruction methods is presented in Table I. The proposed network outperforms the state-of-the-art methods in evaluations performed on both datasets. Results of DRTMO [3] could not be calculated on the City Scene dataset because the network does not accept low resolution images. Even though the feed-forward counterpart of FHDR outperforms the other methods, the proposed FHDR network (4 iterations) performs far better.

V. CONCLUSION

We propose FHDR, a novel feedback network to reconstruct an HDR image from a single exposure LDR image. The dense connections in the forward pass enable feature reuse, thus learning robust representations with minimal parameters. Local and global feedback connections enhance the learning ability, guiding the initial low-level features using the high-level features. Iterative learning forces the network to create a coarse-to-fine representation, which results in early reconstructions. Extensive experiments demonstrate that the FHDR network is successfully able to recover the under-exposed and over-exposed regions, outperforming the state-of-the-art methods.

VI. ACKNOWLEDGEMENT

This research was supported by the SERB Core Research Grant.
REFERENCES
[1] G. Eilertsen, J. Kronander, G. Denes, R. K. Mantiuk, and J. Unger,
“Hdr image reconstruction from a single exposure using deep cnns,”
ACM Transactions on Graphics (TOG), vol. 36, no. 6, p. 178, 2017.
[2] D. Marnerides, T. Bashford-Rogers, J. Hatchett, and K. Debattista, “Ex-
pandnet: A deep convolutional neural network for high dynamic range
expansion from low dynamic range content,” in Computer Graphics
Forum, vol. 37, pp. 37–49, Wiley Online Library, 2018.
[3] Y. Endo, Y. Kanamori, and J. Mitani, “Deep reverse tone mapping.,”
ACM Trans. Graph., vol. 36, no. 6, pp. 177–1, 2017.
[4] X. Yang, K. Xu, Y. Song, Q. Zhang, X. Wei, and R. W. Lau, “Image
correction via deep reciprocating hdr transformation,” in Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition,
pp. 1798–1807, 2018.
[5] S. Lee, G. H. An, and S.-J. Kang, “Deep chain hdri: Reconstructing a
high dynamic range image from a single low dynamic range image,”
IEEE Access, vol. 6, pp. 49913–49924, 2018.
[6] A. R. Zamir, T.-L. Wu, L. Sun, W. B. Shen, B. E. Shi, J. Malik,
and S. Savarese, “Feedback networks,” in Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, pp. 1308–
1317, 2017.
[7] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, “Densely
connected convolutional networks,” in Proceedings of the IEEE confer-
ence on computer vision and pattern recognition, pp. 4700–4708, 2017.
[8] J. Zhang and J.-F. Lalonde, “Learning high dynamic range from outdoor
panoramas,” in Proceedings of the IEEE International Conference on
Computer Vision, pp. 4519–4528, 2017.
[9] Z. Li, J. Yang, Z. Liu, X. Yang, G. Jeon, and W. Wu, “Feedback network
for image super-resolution,” in The IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), 2019.
[10] Y. Zhang, Y. Tian, Y. Kong, B. Zhong, and Y. Fu, “Residual dense
network for image super-resolution,” in Proceedings of the IEEE Con-
ference on Computer Vision and Pattern Recognition, pp. 2472–2481,
2018.
[11] F. Yu and V. Koltun, “Multi-scale context aggregation by dilated
convolutions,” arXiv preprint arXiv:1511.07122, 2015.
[12] N. K. Kalantari and R. Ramamoorthi, “Deep high dynamic range
imaging of dynamic scenes.,” ACM Trans. Graph., vol. 36, no. 4,
pp. 144–1, 2017.
[13] J. Johnson, A. Alahi, and L. Fei-Fei, “Perceptual losses for real-time
style transfer and super-resolution,” in European conference on computer
vision, pp. 694–711, Springer, 2016.
[14] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,”
arXiv preprint arXiv:1412.6980, 2014.
[15] A. O. Akyüz, R. Fleming, B. E. Riecke, E. Reinhard, and H. H. Bülthoff,
“Do hdr displays support ldr content?: a psychophysical evaluation,” in
ACM Transactions on Graphics (TOG), vol. 26, p. 38, ACM, 2007.
[16] R. P. Kovaleski and M. M. Oliveira, “High-quality reverse tone mapping
for a wide range of exposures,” in 2014 27th SIBGRAPI Conference on
Graphics, Patterns and Images, pp. 49–56, IEEE, 2014.
[17] E. Reinhard, M. Stark, P. Shirley, and J. Ferwerda, “Photographic tone
reproduction for digital images,” ACM Trans. Graph., vol. 21, pp. 267–
276, July 2002.
[18] M. D. Grossberg and S. K. Nayar, “What is the space of camera response
functions?,” in 2003 IEEE Computer Society Conference on Computer
Vision and Pattern Recognition, 2003. Proceedings., vol. 2, pp. II–602,
June 2003.
[19] M. Narwaria, R. Mantiuk, M. P. Da Silva, and P. Le Callet, “Hdr-vdp-
2.2: a calibrated method for objective quality prediction of high-dynamic
range and standard images,” Journal of Electronic Imaging, vol. 24,
no. 1, p. 010501, 2015.
[20] F. Banterle, A. Artusi, K. Debattista, and A. Chalmers, Advanced high
dynamic range imaging. AK Peters/CRC Press, 2017.