ABSTRACT
Underwater single-image restoration poses a significant challenge due to the inherent
degradation of images in aquatic environments, characterized by color distortion,
blurring, and poor visibility. In this study, we propose a strategy for underwater
single-image restoration using a CycleGAN-based architecture. The generator in our
CycleGAN adopts a DCGAN-like structure, employing transposed convolutional layers
for efficient upsampling. The discriminators are designed as PatchGANs to evaluate
the realism of translated images at a local level. The model is trained with a
combination of adversarial and cycle-consistency losses, enforcing the preservation
of important image characteristics during translation. Experimental results highlight
the effectiveness of our method in mitigating issues such as color distortion in
various underwater scenarios. This approach offers a valuable contribution to the
field of underwater image restoration, with implications for improved image quality
in marine applications.
1. Introduction
In recent times, there has been growing interest in exploring the underwater world,
leading to the emergence of visual tasks centered on underwater scenes. For instance,
an underwater simultaneous localization and mapping system was introduced by
Rahman et al. [1] for closed-loop underwater scene detection and relocalization. A
technique for recognizing a diver's hand gestures in underwater situations was
presented by Jiang et al. [2]. However, the intricacy of undersea scenery limits the
accuracy of such tasks: water selectively absorbs light, resulting in a prevalent
blue-green appearance in underwater images. This phenomenon significantly hampers
both the visual experience and the precision of downstream visual tasks. As a result,
underwater photographs must be processed to improve their visual quality.
For hazy atmospheric conditions, He et al. [3] proposed the dark channel prior
(DCP) by analysing several outdoor hazy photographs. This approach, however, is
unsuitable for underwater situations. Drews et al. [4] then developed an underwater
dark channel prior (UDCP) technique that combined the DCP with an underwater
physical model (UPM) specifically designed for underwater scenes. Although this
method works better in certain situations, it has drawbacks as well. The
contrast-limited adaptive histogram equalisation technique was presented by Kim et al.
[5] as a way to improve image contrast; however, it may not produce the best results
in more complex underwater scenes.
In recent years, methods grounded in deep learning have gained popularity. Compared
to traditional approaches, these methods produce higher-quality restored images,
better meeting user needs. The application of Generative Adversarial Networks
(GANs) [6], initially employed for tasks such as image translation [7] and super-
resolution [8], has extended to underwater image restoration, aligning with the
broader category of image migration. Consequently, training networks with image
pairs offers a potential solution to this task. However, acquiring image pairs is
challenging, given the difficulty and limited availability of such samples.
To address this challenge, we propose employing the Cycle Generative Adversarial
Network (CycleGAN) for real-time improvement of underwater photographs. The goal
is for the model to learn the various degradation effects inherent in an underwater
photograph and subsequently generate an enhanced image that is free of these
effects.
2. Related Works
2.1 Traditional methods
Before the introduction of the Underwater Physical Model (UPM) [9], attempts to
restore underwater images primarily relied on straightforward image processing
techniques, involving alterations to the pixel values of the images. Iqbal et al. [10]
pursued image enhancement through an Integrated Color Model (ICM), while Huang
et al. [11] presented a method known as Relative Global Histogram Stretching
(RGHS). Additionally, Bai et al. [12] employed a technique that involved
regionalizing the pixel intensity center, employing histogram global and local
equalization, and incorporating multi-scale fusion for underwater image restoration.
Notably, these methods operated in diverse color spaces, offering swift processing
speeds and low computational complexity. However, they were not without
drawbacks, requiring manual selection of parameters to achieve optimal image
outcomes.
Some UPM-based image restoration techniques then surfaced. The primary idea is to
estimate the image's transmission map and background light intensity in order to
invert the model. To estimate these parameters, Akkaynak et al. [13] analysed
changes in the light source and dark pixels in the image; better restoration results
are achieved, although additional tools are needed. Peng et al. [14] suggested
leveraging image blurriness and light absorption to estimate scene depth and complete
image restoration; its estimation accuracy exceeds that of DCP. The precision of
underwater model parameter estimation limits the image quality recovered by these
approaches, and their generalisation ability is subpar.
3.3 Discriminator
Convolutional neural networks (CNNs) serve as the foundation for the discriminator
networks. Each discriminator employs five convolutional layers to extract deep
features from an image and distinguish between the two sets of photos.
Fig 3.3 Discriminator
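A useful property of such a convolutional discriminator is its receptive field: each output unit judges one local patch of the input, which is what makes it a PatchGAN. The sketch below computes that patch size for five layers; the kernel size of 4 and strides (2, 2, 2, 1, 1) are the usual pix2pix/CycleGAN choices and are assumptions here, since the paper only states that five convolutional layers are used.

```python
# Receptive-field calculation for a 5-layer convolutional PatchGAN
# discriminator. Kernel size 4 and strides (2, 2, 2, 1, 1) are assumed
# (typical pix2pix/CycleGAN settings), not stated in the text above.

def receptive_field(kernels, strides):
    """Return the input-pixel receptive field of one final-layer unit."""
    rf, jump = 1, 1  # start from a single output unit
    for k, s in zip(kernels, strides):
        rf += (k - 1) * jump  # each layer widens the field by (k-1)*jump
        jump *= s             # stride compounds the sampling step
    return rf

layers = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
kernels, strides = zip(*layers)
print(receptive_field(kernels, strides))  # 70: each patch decision
                                          # covers a 70x70 input region
```

Under these assumptions each discriminator output scores a 70x70 patch of the image, so realism is judged locally rather than over the whole frame.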
4. Objective Function Formulation
Our purpose is to acquire mapping functions between two domains, X and Y, using
training samples {xi} with xi ∈ X and {yj} with yj ∈ Y. The data distributions are
denoted x ~ pdata(x) and y ~ pdata(y). Our model includes two mappings, G: X → Y and
F: Y → X. Furthermore, we introduce adversarial discriminators DX and DY.
The two primary components of the objective function are cycle consistency losses,
which are intended to prevent conflicts between the learnt mappings G and F, and
adversarial losses, which seek to match the distribution of produced pictures with the
data distribution in the target domain.
4.1 Adversarial Loss
The goal of a traditional conditional Generative Adversarial Network (GAN)-based
model is to learn a mapping G: {X, Z} → Y, where Z stands for random noise and Y is
the target (desired) domain. The conditional adversarial loss is formulated as:

L_cGAN(G, D_Y, X, Y) = E_{y~pdata(y)}[log D_Y(y)] + E_{x~pdata(x),z}[log(1 − D_Y(G(x, z)))] (1)

Here, the discriminator D_Y seeks to maximise L_cGAN, while the generator G seeks to
minimise it.
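The opposing roles of G and D_Y in Eq. (1) can be illustrated numerically. The discriminator scores below are made-up values, not outputs of a trained network:

```python
import math

# Toy illustration of the adversarial loss in Eq. (1) for one sample pair.
# d_real = D_Y(y) and d_fake = D_Y(G(x, z)) are hypothetical scores in (0, 1).

def adversarial_loss(d_real, d_fake):
    """log D_Y(y) + log(1 - D_Y(G(x, z))) for a single real/fake pair."""
    return math.log(d_real) + math.log(1.0 - d_fake)

# A confident discriminator (real scored 0.9, fake scored 0.1) keeps the
# loss near zero; a fooled one (both scored 0.5) drives it more negative.
confident = adversarial_loss(d_real=0.9, d_fake=0.1)
fooled = adversarial_loss(d_real=0.5, d_fake=0.5)
print(confident > fooled)  # True: D_Y maximises this quantity, G minimises it
```

This is why training is a minimax game: improving G pushes d_fake up, which lowers the quantity D_Y is trying to maximise.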
4.2 Cycle Consistency Loss
Theoretically, adversarial training alone can learn mappings G and F whose outputs
are distributed identically to the target domains Y and X, respectively (strictly
speaking, this requires G and F to be stochastic functions). However, with enough
capacity, a network may map the same set of input images to any random permutation
of images in the target domain, and any of the learnt mappings can induce an output
distribution that matches the target distribution. Thus, adversarial losses alone
cannot guarantee that the learnt function maps an individual input xi to the intended
output yi. To reduce the space of possible mapping functions, we require the learnt
mappings to be cycle-consistent: the image translation cycle should bring each image
x from domain X back to its original form,

i.e., x → G(x) → F(G(x)) ≈ x. (2)

L_cyc(G, F) = E_{x~pdata(x)}[||F(G(x)) − x||1] + E_{y~pdata(y)}[||G(F(y)) − y||1] (3)
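A minimal sketch of Eq. (3) on flat pixel lists. The toy mappings G and F below (a constant intensity shift and its inverse) are hypothetical stand-ins for the actual generator networks:

```python
# Cycle-consistency loss from Eq. (3) over small batches of "images"
# represented as flat pixel lists. G and F are toy placeholder mappings.

def l1(a, b):
    """Mean absolute (L1) difference between two equal-length sequences."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

def G(x):  # toy "underwater -> restored" mapping: brighten by 10
    return [p + 10 for p in x]

def F(y):  # toy inverse mapping: darken by 10
    return [p - 10 for p in y]

def cycle_loss(xs, ys):
    # forward cycle x -> G(x) -> F(G(x)), backward cycle y -> F(y) -> G(F(y))
    forward = sum(l1(F(G(x)), x) for x in xs) / len(xs)
    backward = sum(l1(G(F(y)), y) for y in ys) / len(ys)
    return forward + backward

xs = [[0, 50, 100], [20, 40, 60]]
ys = [[10, 60, 110]]
print(cycle_loss(xs, ys))  # 0.0: F perfectly inverts G, so both cycles close
```

When F fails to invert G, both terms grow, penalising mappings that scramble inputs even if their output distribution matches the target domain.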
4.3 Full Objective
Our full objective is:

L(G, F, D_X, D_Y) = L_GAN(G, D_Y, X, Y) + L_GAN(F, D_X, Y, X) + λ L_cyc(G, F) (4)

where λ controls the relative importance of the two objectives. We aim to solve:

G*, F* = arg min_{G,F} max_{D_X,D_Y} L(G, F, D_X, D_Y) (5)
4.4 Training
With a batch size of eight, we used the Adam optimiser to minimise the objective in
Eq. (5). Every network was trained from scratch with a learning rate of 0.0002. The
learning rate stays constant for the first 100 epochs and then decreases linearly to
zero over the next 100 epochs. This training configuration aims to optimise the model
parameters efficiently during the training phase.
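The schedule above (constant for 100 epochs, then linear decay to zero over the next 100) can be sketched as a simple function of the epoch index. Zero-based epoch numbering is an assumption here:

```python
# Learning-rate schedule described in the training setup: 2e-4 held for the
# first 100 epochs, then linear decay to zero over the following 100 epochs.

BASE_LR = 2e-4
CONSTANT_EPOCHS = 100
DECAY_EPOCHS = 100

def learning_rate(epoch):
    """Learning rate for a given zero-based epoch index."""
    if epoch < CONSTANT_EPOCHS:
        return BASE_LR
    # fraction of the decay phase already elapsed
    progress = (epoch - CONSTANT_EPOCHS) / DECAY_EPOCHS
    return BASE_LR * max(0.0, 1.0 - progress)

print(learning_rate(0))    # 0.0002
print(learning_rate(150))  # 0.0001: halfway through the decay phase
print(learning_rate(200))  # 0.0
```

In practice the same shape is often implemented with a framework scheduler (e.g. a lambda-based LR scheduler) rather than by hand.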
5. Dataset
5.1 EUVP Dataset
The EUVP (Enhancing Underwater Visual Perception) collection consists of discrete
sets of paired and unpaired picture samples that demonstrate different levels of
perceptual quality, encompassing both high and low quality examples. The purpose of
this dataset is to facilitate the supervised training of algorithms to improve the quality
of underwater images.
Paired Dataset

Name                  Training Pairs   Validation   Total
Underwater Dark       5500             500          11500
Underwater ImageNet   3700             1200         8600
Underwater Scenes     285              130          4500

Table 5.1.1

Unpaired Dataset

Poor Quality   Good Quality   Validation   Total
3195           3140           330          6665

Table 5.1.2
5.2 UIEB Dataset
The UIEB (Underwater Image Enhancement Benchmark) [20] consists of two subsets.
There are 890 raw underwater photos in the first collection, along with accompanying
high-quality reference photographs. The second subset is composed of 60 challenging
underwater images, designed to present more complex scenarios for benchmarking
image enhancement algorithms.
6. Metrics and Results
6.1 Evaluation Metrics
In our study, we apply three standard metrics, namely Peak Signal-to-Noise Ratio
(PSNR), Structural Similarity (SSIM), and the Underwater Image Quality Measure
(UIQM). By quantitatively comparing the restored images with their corresponding
ground truths, these metrics offer a thorough evaluation of the image enhancement
performance.
6.1.1 Peak Signal-to-Noise Ratio (PSNR)
PSNR serves as an approximation of the reconstruction quality of a generated image x
compared with its ground truth y, and it is computed from their Mean Squared Error
(MSE) using the following formula:

PSNR(x, y) = 10 log10(255² / MSE(x, y)) (6)
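Eq. (6) translates directly into code for 8-bit images flattened to pixel lists:

```python
import math

# Direct implementation of Eq. (6) for 8-bit images given as flat pixel lists.

def psnr(x, y, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length images."""
    mse = sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)
    if mse == 0:
        return float("inf")  # identical images: no noise at all
    return 10.0 * math.log10(peak ** 2 / mse)

restored = [50, 100, 150, 200]
reference = [52, 98, 149, 205]
print(psnr(restored, reference))  # higher is better; small errors give ~39 dB
```

Note the maximally wrong case, an all-black image against an all-white one, scores exactly 0 dB, since the MSE then equals 255².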
6.1.2 Structural similarity index measurement (SSIM)
SSIM (Structural Similarity) compares image patches by evaluating three properties:
luminance, contrast, and structure. Its definition is as follows:
SSIM(x, y) = ((2μ_x μ_y + c1)(2σ_xy + c2)) / ((μ_x² + μ_y² + c1)(σ_x² + σ_y² + c2)) (7)
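Eq. (7) can be evaluated in one pass as a sketch. Real SSIM averages this statistic over local sliding windows (often with Gaussian weighting); computing it once over the whole signal, as assumed below, keeps the example short. The constants use the standard choices for an 8-bit dynamic range:

```python
# Single-window (global) evaluation of the SSIM formula in Eq. (7).
# Standard constants: c1 = (0.01 * L)^2, c2 = (0.03 * L)^2 with L = 255.

C1 = (0.01 * 255) ** 2
C2 = (0.03 * 255) ** 2

def ssim_global(x, y):
    """SSIM of two equal-length pixel lists, computed over one window."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n                       # luminance terms
    vx = sum((a - mx) ** 2 for a in x) / n                # contrast term for x
    vy = sum((b - my) ** 2 for b in y) / n                # contrast term for y
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n  # structure term
    return ((2 * mx * my + C1) * (2 * cov + C2)) / (
        (mx ** 2 + my ** 2 + C1) * (vx + vy + C2)
    )

img = [10, 80, 160, 240]
print(ssim_global(img, img))  # 1.0: a signal is perfectly similar to itself
```

SSIM is bounded above by 1, reached only for identical inputs; anti-correlated structure can even drive it negative.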
6.1.3 Underwater Image Quality Metric (UIQM)
Underwater image colorfulness measure (UICM), underwater image sharpness
measure (UISM), and underwater image contrast measure (UIConM) are the three
attribute measurements that make up the underwater image quality measure (UIQM).
The underwater image colorfulness measure (UICM) addresses the colour-casting
problem common in underwater photos. Colours attenuate progressively with water
depth according to their wavelengths; in particular, red, which has the longest
wavelength in the visible spectrum, is absorbed first. As a result, photos taken
underwater frequently have a blue or greenish tint. Limited illumination further
contributes to the notable colour desaturation of underwater photos.
We use the two opponent colour components related to chrominance, RG and YB, to
measure the UICM, as defined by:

RG = R − G (8)
YB = (R + G)/2 − B (9)
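Eqs. (8) and (9) are per-pixel opponent channels; the example pixel values below are hypothetical:

```python
# Per-pixel opponent chrominance components from Eqs. (8) and (9). A strong
# blue-green cast shows up as negative RG (green dominating red) and negative
# YB (blue dominating yellow).

def chrominance(r, g, b):
    rg = r - g               # Eq. (8): red vs green opponent channel
    yb = (r + g) / 2 - b     # Eq. (9): yellow vs blue opponent channel
    return rg, yb

# Hypothetical blue-green underwater pixel: little red, plenty of green/blue.
rg, yb = chrominance(r=30, g=120, b=150)
print(rg, yb)  # -90 -75.0: both negative, consistent with a blue-green cast
```

The UICM then aggregates the statistics of these two channels over the whole image to score its colourfulness.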
6.2 Summary
In conclusion, a higher PSNR score indicates that the result is more congruent with
the reference image's content, a higher SSIM score indicates that the result is more
comparable to the reference image's structure and texture, and a higher UIQM score
indicates that the result is more in line with human visual perception.
6.3 Experimental Results
6.3.1 Metrics Comparison
Metrics   UDCP    UWGAN   CycleGAN
SSIM      0.24    0.74    0.87
PSNR      14.32   23.74   28.77
UIQM      2.27    3.10    3.07

Table 6.3
6.3.2 Results Comparison
Fig 6.3.2 Results of Various Methods (input and output images for UDCP, UWGAN, and CycleGAN)
7. Conclusion
In conclusion, this research offers valuable insights into the effectiveness of
diverse underwater image restoration techniques. While UDCP fell short of
expectations for underwater applications, UWGAN [21] and CycleGAN demonstrated
notable potential. UWGAN [21] excelled in the generation of synthetic underwater
images, while CycleGAN emerged as the preferred choice for restoring both paired and
unpaired underwater datasets, showcasing superior performance across a range of
evaluation metrics. The implementation of CycleGAN proved to be the most promising
technique, yielding superior results across all three metrics, with notable
excellence in PSNR. The use of two generators and two discriminators, initially
trained on paired datasets and later tested on unpaired data, underscored the
effectiveness of CycleGAN in tackling the complexities inherent in underwater image
restoration and highlighted the model's adaptability to unpaired data, thereby
enhancing its practical utility.
REFERENCES:
1. Rahman, S., Li, A.Q., Rekleitis, I.: SVIn2: an underwater SLAM system using sonar,
visual, inertial, and depth sensor. In: Proceedings of the IEEE International
Conference on Intelligent Robots and Systems (IROS), pp. 1861–1868 (2019)
2. Jiang, Y., Zhao, M., Wang, C., Wei, F., Wang, K., Qi, H.: Diver's hand gesture
recognition and segmentation for human–robot interaction on AUV. Signal, Image and
Video Processing, 1–8 (2021)
3. He, K., Sun, J., Tang, X.: Single image haze removal using dark channel prior.
IEEE Trans. Pattern Anal. Mach. Intell. 33(12), 2341–2353 (2010)
4. Drews, P., Nascimento, E., Moraes, F., Botelho, S., Campos, M.: Transmission
estimation in underwater single images. In: Proceedings of the IEEE International
Conference on Computer Vision (ICCV), pp. 825–830 (2013)
5. Kim, J.-Y., Kim, L.-S., Hwang, S.-H.: An advanced contrast enhancement using
partially overlapped sub-block histogram equalization. IEEE Trans. Circuits Syst.
Video Technol. 11(4), 475–484 (2001)
6. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S.,
Courville, A., Bengio, Y.: Generative adversarial nets. Adv. Neural Inf. Process.
Systems 27 (2014)
7. Yan, L., Zheng, W., Wang, F.-Y., Gou, C.: Joint image-to-image translation with
denoising using enhanced generative adversarial networks. Signal Process. Image
Commun. 91, 116072 (2021)
8. Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and
super-resolution. In: European Conference on Computer Vision(ECCV), pp. 694–711
(2016)
9. Schettini, R., Corchs, S.: Underwater image processing: state of the art of
restoration and image enhancement methods. EURASIP J. Adv. Signal Process. 2010,
1–14 (2010)
10. Iqbal, K., Salam, R.A., Osman, A., Talib, A.Z.: Underwater image enhancement
using an integrated colour model. IAENG Int. J. Comput. Sci. 34(2), 239–244 (2007)
11. Huang, D., Wang, Y., Song, W., Sequeira, J., Mavromatis, S.: Shallow-water
image enhancement using relative global histogram stretching based on adaptive
parameter acquisition. In: International Conference on Multimedia Modeling(MMM),
pp. 453–465 (2018)
12. Bai, L., Zhang, W., Pan, X., Zhao, C.: Underwater image enhancement based on
global and local equalization of histogram and dual-image multi-scale fusion. IEEE
Access 8, 128973–128990 (2020)
13. Akkaynak, D., Treibitz, T.: Sea-thru: A method for removing water from
underwater images. In: Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition(CVPR), pp. 1682– 1691 (2019)
14. Peng, Y.-T., Cosman, P.C.: Underwater image restoration based on image
blurriness and light absorption. IEEE Trans. Image Process. 26(4), 1579–1594 (2017)
15. Hashisho, Y., Albadawi, M., Krause, T., von Lukas, U.F.: Underwater color
restoration using u-net denoising autoencoder. In: Proceedings of the International
Symposium on Image and Signal Processing and Analysis (ISPA), pp. 117–122 (2019)
16. Li, C., Guo, C., Ren, W., Cong, R., Hou, J., Kwong, S., Tao, D.: An underwater
image enhancement benchmark dataset and beyond. IEEE Trans. Image Process. 29,
4376–4389 (2019)
17. Li, C., Anwar, S., Porikli, F.: Underwater scene prior inspired deep underwater
image and video enhancement. Pattern Recogn. 98, 107038 (2020)
18. Li, J., Skinner, K.A., Eustice, R.M., Johnson-Roberson, M.: Watergan:
Unsupervised generative network to enable real-time color correction of monocular
underwater images. IEEE Robot. Autom. Lett. 3(1), 387–394 (2017)
19. Chen, X., Yu, J., Kong, S., Wu, Z., Fang, X., Wen, L.: Towards real-time
advancement of underwater visual quality with gan. IEEE Trans. Industr. Electron.
66(12), 9350–9359 (2019)
20. Li, C., Guo, C., Ren, W., Cong, R., Hou, J., Kwong, S., Tao, D.: An underwater
image enhancement benchmark dataset and beyond. IEEE Trans. Image Process. 29,
4376–4389 (2019)
21. Wang, N., Zhou, Y., Han, F., Zhu, H., Yao, J.: UWGAN: underwater GAN for
real-world underwater color restoration and dehazing. arXiv preprint arXiv:1912.10269
(2019)