
Underwater Single Image Restoration Using CycleGAN

ABSTRACT
Underwater single image restoration poses a significant challenge due to the inherent
degradation of images in aquatic environments, characterized by color distortion,
blurring, and poor visibility. In this study, we propose a strategy for underwater
single image restoration based on a CycleGAN architecture. The generator adopts a
DCGAN-like structure, employing transposed convolutional layers for efficient
upsampling. The discriminators are designed as PatchGANs that evaluate the realism of
translated images at a local level. The model is trained with a combination of
adversarial and cycle-consistency losses, enforcing the preservation of important image
characteristics during translation. Experimental results highlight the effectiveness of
our method in mitigating issues such as color distortion across various underwater
scenarios. This approach offers a valuable contribution to the field of underwater
image restoration, with implications for improved image quality in marine applications.

1. Introduction
In recent times, there has been growing interest in the exploration of the underwater
world, leading to the emergence of visual tasks centered around underwater scenes.
For instance, Rahman et al. [1] introduced an underwater simultaneous localization and
mapping system for closed-loop underwater scene detection and relocalization, and
Jiang et al. [2] presented a technique for recognizing a diver's hand gestures in
underwater environments. However, the complexity of underwater scenes limits the
accuracy of these tasks: water selectively absorbs light, giving underwater images a
prevalent blue-green appearance. This phenomenon significantly degrades both the
visual experience and the precision of downstream visual tasks. As a result,
underwater images must be processed to improve their visual quality.
For hazy atmospheric scenes, He et al. [3] proposed the dark channel prior (DCP)
by analysing several outdoor hazy photographs. This approach, however, is unsuitable
for underwater scenes. Drews et al. [4] then developed an underwater dark channel
prior (UDCP) technique that combines DCP with the underwater physical model (UPM)
specifically designed for underwater scenes. Although this method works better in
certain situations, it has drawbacks as well. Kim et al. [5] presented a
contrast-limited adaptive histogram equalisation technique to improve image contrast,
but this strategy may not produce the best results in more complex underwater scenes.

In recent years, methods grounded in deep learning have gained popularity. Compared
with traditional approaches, these methods produce higher-quality restored images that
better meet user needs. Generative Adversarial Networks (GANs) [6], initially employed
for tasks such as image translation [7] and super-resolution [8], have been extended
to underwater image restoration, which falls within the broader category of
image-to-image translation. Consequently, training networks with paired images offers
a potential solution to this task. However, acquiring image pairs is challenging,
given the difficulty and limited availability of such samples.
To address this challenge, we propose employing the Cycle Generative Adversarial
Network (CycleGAN) for real-time improvement of underwater photographs. The goal is
for the model to learn the various degradation effects inherent in an underwater
photograph and subsequently generate an enhanced image that is free of these effects.

2. Related Works
2.1 Traditional methods

Before the introduction of the Underwater Physical Model (UPM) [9], attempts to
restore underwater images primarily relied on straightforward image processing
techniques, involving alterations to the pixel values of the images. Iqbal et al. [10]
pursued image enhancement through an Integrated Color Model (ICM), while Huang
et al. [11] presented a method known as Relative Global Histogram Stretching
(RGHS). Additionally, Bai et al. [12] employed a technique that involved
regionalizing the pixel intensity center, employing histogram global and local
equalization, and incorporating multi-scale fusion for underwater image restoration.
Notably, these methods operated in diverse color spaces, offering swift processing
speeds and low computational complexity. However, they were not without
drawbacks, requiring manual selection of parameters to achieve optimal image
outcomes.

UPM-based image restoration techniques subsequently emerged. Their primary concept
is to estimate the image's transmission map and background light intensity in order
to invert the model. Akkaynak et al. [13] analysed changes in the light source and
dark pixels in the image to estimate the parameters; better restoration results are
achieved, although additional equipment is needed. Peng et al. [14] proposed
leveraging image blurriness and light absorption to estimate scene depth and complete
image restoration; its estimation accuracy is better than that of DCP. However, the
quality of the images recovered by these approaches is limited by the precision of
underwater model parameter estimation, and their generalisation ability is subpar.

2.2 Deep Learning Methods

Deep learning techniques use data-driven strategies to restore underwater photos.
Hashisho et al. [15] proposed a U-Net denoising autoencoder to reconstruct the image.
Li et al. [16] created a small collection of underwater images and also developed the
Underwater Convolutional Neural Network [17] model. Li et al. [18] proposed WaterGAN
to produce realistic underwater photos; however, it requires the image's depth
information, which is difficult to obtain. Chen et al. [19] proposed a GAN-based
restoration approach that employs a dual-branch discriminator to preserve the image
content. Deep learning algorithms can obtain high-quality photos, but they are
challenging to train owing to their complicated network structures, and paired
underwater images are scarce.
3. Proposed Method

Fig 3.1 CycleGAN model


CycleGAN, or Cycle Generative Adversarial Network, is a deep learning model employed
for image-to-image translation problems. In the context of underwater single image
restoration, CycleGAN can be used to improve underwater image quality.
The core idea behind CycleGAN is to have two generators and two discriminators
working in a cycle. The generators aim to transform images from one domain to the
other and vice versa, while the discriminators work to discern between artificially
generated and real images. Our model is shown in Fig 3.1.
Below, we explain in detail how and why this method is effective for underwater
single image restoration.

3.1 Adversarial Training

The primary objective of adversarial training in CycleGAN is to train the two
generators (G_A and G_B) and the two discriminators in a competitive manner. Each
generator aims to transform images from one domain into the other (e.g., underwater
images into clear images) so as to deceive the corresponding discriminator, while the
discriminators' goal is to differentiate between generated images and real, clear
images; a small loss sketch is given at the end of this subsection.
Why It Works
• Minimax Game: Adversarial training sets up a minimax game in which the
discriminators try to become better at distinguishing genuine from generated images,
while the generators try to produce realistic images that mislead them.
• Image Realism Improvement: As the generators and discriminators play this game
iteratively, the generators become better at creating realistic images, leading to
improved image quality.
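As a concrete sketch, one possible PyTorch formulation of these adversarial terms is given below, using a binary cross-entropy criterion that matches the log form of the loss formalised later in Eq. (1); the helper and variable names are ours, and the specific criterion is an assumption rather than a detail fixed by this paper.

```python
import torch
import torch.nn as nn

# Minimal sketch (assumption: discriminators output raw PatchGAN logits).
adv_criterion = nn.BCEWithLogitsLoss()

def generator_adv_loss(G, D, real):
    """Generator loss: make the discriminator classify generated images as real."""
    fake = G(real)                                  # translate to the other domain
    pred_fake = D(fake)                             # PatchGAN grid of logits
    loss = adv_criterion(pred_fake, torch.ones_like(pred_fake))
    return loss, fake

def discriminator_adv_loss(D, real, fake):
    """Discriminator loss: separate real images from generated ones."""
    pred_real = D(real)
    pred_fake = D(fake.detach())                    # block gradients into the generator
    loss_real = adv_criterion(pred_real, torch.ones_like(pred_real))
    loss_fake = adv_criterion(pred_fake, torch.zeros_like(pred_fake))
    return 0.5 * (loss_real + loss_fake)
```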
3.2 Cycle Consistency
When an image is translated from domain A to domain B and back again, the cycle
consistency loss ensures that the round-trip translation is consistent across domains:
the result should resemble the original image (a minimal loss sketch is given at the
end of this subsection).
Forward translation (G_A): an image from domain A (underwater) is translated to
domain B (clear) using G_A.
Backward translation (G_B): the translated image from domain B is then mapped back
to domain A using G_B.
Why It Works
• Preservation of Structure: Cycle consistency ensures that important structural
information in the image is preserved during translation.
• Avoiding Information Loss: It prevents information loss during the translation
process, ensuring that the restored image stays consistent with the original.
Why Both Adversarial and Cycle Consistency?
• Complementary Objectives: Adversarial training and cycle consistency are
complementary objectives. Adversarial training focuses on image realism, while cycle
consistency focuses on maintaining structural information.
• Balancing Realism and Consistency: Together, these objectives strike a balance
between generating realistic images and ensuring that the generated images remain
consistent with the originals.
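To make this concrete, a minimal PyTorch sketch of the cycle term (an L1 penalty on the round-trip reconstructions, as formalised later in Eq. (3)) is given below; G_A and G_B denote the two generators, and the function name is ours.

```python
import torch.nn as nn

l1 = nn.L1Loss()

def cycle_consistency_loss(G_A, G_B, real_A, real_B):
    """L1 distance between originals and their round-trip reconstructions (both directions)."""
    rec_A = G_B(G_A(real_A))   # A -> B -> A
    rec_B = G_A(G_B(real_B))   # B -> A -> B
    return l1(rec_A, real_A) + l1(rec_B, real_B)
```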
Generator
We adapt the architecture of our generative networks from DCGAN. The model contains
three convolutional layers, several residual blocks, two fractionally-strided
convolutions with a stride of 1/2, and a final convolutional layer that maps features
to the RGB space. We use 6 residual blocks for 128×128 images and 9 blocks for
256×256 and higher-resolution training images.
In our design, we use Leaky ReLU as the activation function and integrate instance
normalisation; the last layer uses a hyperbolic tangent activation. These choices
follow the cited literature and yield a consistent and functional model design.

Fig 3.2 Generator
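As an illustration of this design, a minimal PyTorch sketch of such a generator is given below; the channel widths and padding choices are assumptions rather than the exact configuration used in our experiments.

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch))

    def forward(self, x):
        return x + self.block(x)

class Generator(nn.Module):
    def __init__(self, n_blocks=9):  # 6 blocks for 128x128 inputs, 9 for 256x256 and larger
        super().__init__()
        layers = [nn.Conv2d(3, 64, 7, padding=3), nn.InstanceNorm2d(64), nn.LeakyReLU(0.2, inplace=True)]
        # two stride-2 downsampling convolutions
        for in_ch, out_ch in [(64, 128), (128, 256)]:
            layers += [nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),
                       nn.InstanceNorm2d(out_ch), nn.LeakyReLU(0.2, inplace=True)]
        layers += [ResidualBlock(256) for _ in range(n_blocks)]
        # two fractionally-strided (transposed) convolutions for upsampling
        for in_ch, out_ch in [(256, 128), (128, 64)]:
            layers += [nn.ConvTranspose2d(in_ch, out_ch, 3, stride=2, padding=1, output_padding=1),
                       nn.InstanceNorm2d(out_ch), nn.LeakyReLU(0.2, inplace=True)]
        layers += [nn.Conv2d(64, 3, 7, padding=3), nn.Tanh()]  # map features back to RGB
        self.model = nn.Sequential(*layers)

    def forward(self, x):
        return self.model(x)
```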

Discriminator
Convolutional neural networks (CNNs) serve as the foundation of the discriminator
networks. Each discriminator employs five convolutional layers to extract deep
features from the image and to distinguish between the two sets of images.
Fig 3.3 Discriminator
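A minimal sketch of a PatchGAN-style discriminator with five convolutional layers is shown below; the channel progression follows common PatchGAN implementations and is an assumption, not a detail fixed by this paper.

```python
import torch.nn as nn

class Discriminator(nn.Module):
    """PatchGAN: outputs a grid of logits, each judging the realism of one local patch."""
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.InstanceNorm2d(128), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.InstanceNorm2d(256), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(256, 512, 4, stride=1, padding=1), nn.InstanceNorm2d(512), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(512, 1, 4, stride=1, padding=1))  # per-patch realism logits

    def forward(self, x):
        return self.model(x)
```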
4. Objective Function Formulation
Our goal is to learn mapping functions between two domains, X and Y, using training
samples {xi}, where xi ∈ X, and {yj}, where yj ∈ Y. The data distributions are
denoted x ~ pdata(x) and y ~ pdata(y). Our model includes two mappings, G: X → Y and
F: Y → X, together with adversarial discriminators D_X and D_Y.
The objective function has two primary components: adversarial losses, which match
the distribution of generated images with the data distribution in the target domain,
and cycle consistency losses, which prevent the learnt mappings G and F from
contradicting each other.

4.1 Adversarial Loss
The goal of a traditional conditional Generative Adversarial Network (cGAN) is to
learn a mapping G: {X, Z} → Y, where Z stands for random noise, X is the source
domain, and Y is the target (desired) domain. The conditional adversarial loss is
formulated as:

$\mathcal{L}_{cGAN}(G, D_Y, X, Y) = \mathbb{E}_{y \sim p_{data}(y)}[\log D_Y(y)] + \mathbb{E}_{x \sim p_{data}(x),\, z}[\log(1 - D_Y(G(x, z)))]$   (1)

Here, the discriminator D_Y seeks to maximise L_cGAN, while the generator G seeks to
minimise it.
4.2 Cycle Consistency Loss
Theoretically, one might learn mappings G and F using adversarial training such that
the resulting outputs have distributions exactly like those of the target domains Y and
X, respectively (strictly speaking, this needs G and F to be stochastic functions). It
may, however, transfer the same set of input photos to any random permutation of
images in the target domain if the network has enough capacity. Any of the learnt
mappings in this case can cause an output distribution to resemble the target
distribution. As such, relying on adversarial losses alone cannot guarantee that the
learnt function maps a single input $x_i$ to the intended output $y_i$. We therefore
require the learnt mapping functions to be cycle-consistent in order to reduce the
space of possible mapping functions: the image translation cycle should bring each
image $x$ from domain $X$ back to its original form. This additional criterion
improves the mapping functions' ability to reliably and precisely map different
inputs to the intended outputs, i.e.,

$x \rightarrow G(x) \rightarrow F(G(x)) \approx x.$   (2)

The cycle consistency loss is defined as

$\mathcal{L}_{cyc}(G, F) = \mathbb{E}_{x \sim p_{data}(x)}\left[\|F(G(x)) - x\|_1\right] + \mathbb{E}_{y \sim p_{data}(y)}\left[\|G(F(y)) - y\|_1\right]$   (3)
4.3 Full Objective
Our full objective is:

$\mathcal{L}(G, F, D_X, D_Y) = \mathcal{L}_{GAN}(G, D_Y, X, Y) + \mathcal{L}_{GAN}(F, D_X, Y, X) + \lambda\,\mathcal{L}_{cyc}(G, F)$   (4)

where $\lambda$ controls the relative importance of the two objectives. We aim to solve:

$G^*, F^* = \arg\min_{G, F}\,\max_{D_X, D_Y} \mathcal{L}(G, F, D_X, D_Y)$   (5)
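As a rough illustration of how Eq. (4) is assembled in code, the sketch below reuses the hypothetical generator_adv_loss and cycle_consistency_loss helpers from the Section 3 sketches; the names and the weighting lambda_cyc = 10 are assumptions, not values fixed by this paper.

```python
def generator_objective(G_A, G_B, D_A, D_B, real_A, real_B, lambda_cyc=10.0):
    """Total generator loss of Eq. (4): two adversarial terms plus the weighted cycle term."""
    loss_adv_AB, _ = generator_adv_loss(G_A, D_B, real_A)   # G: X -> Y, judged by D_Y
    loss_adv_BA, _ = generator_adv_loss(G_B, D_A, real_B)   # F: Y -> X, judged by D_X
    loss_cyc = cycle_consistency_loss(G_A, G_B, real_A, real_B)
    return loss_adv_AB + loss_adv_BA + lambda_cyc * loss_cyc
```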

4.4 Training
We minimise the objective in Eq. (5) using the Adam optimizer with a batch size of
eight. Every network was trained from scratch with a learning rate of 0.0002. The
learning rate is held constant for the first 100 epochs and then decays linearly to
zero over the next 100 epochs. This training configuration is intended to optimise
the model parameters efficiently during the training phase.
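This schedule can be sketched in PyTorch as follows; the Generator modules come from the sketch in Section 3, the Adam betas (0.5, 0.999) are an assumption, and only the 0.0002 learning rate and the 100 + 100 epoch schedule follow the text above.

```python
import itertools
import torch

G_A, G_B = Generator(), Generator()   # generator modules as sketched in Section 3

# One joint Adam optimizer over both generators; the discriminators would use their own.
opt_G = torch.optim.Adam(itertools.chain(G_A.parameters(), G_B.parameters()),
                         lr=2e-4, betas=(0.5, 0.999))

def lr_lambda(epoch, n_constant=100, n_decay=100):
    """Keep the learning rate constant for 100 epochs, then decay it linearly to zero."""
    return 1.0 - max(0, epoch - n_constant) / float(n_decay)

scheduler_G = torch.optim.lr_scheduler.LambdaLR(opt_G, lr_lambda=lr_lambda)
# scheduler_G.step() is called once at the end of each epoch.
```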
5. Dataset
5.1 EUVP Dataset
The EUVP (Enhancing Underwater Visual Perception) collection consists of discrete
sets of paired and unpaired picture samples that demonstrate different levels of
perceptual quality, encompassing both high and low quality examples. The purpose of
this dataset is to facilitate the supervised training of algorithms to improve the quality
of underwater images.
Paired Dataset
Name                  Training Pairs   Validation   Total
Underwater Dark       5500             500          11500
Underwater ImageNet   3700             1200         8600
Underwater Scenes     285              130          4500

Table 5.1.1

Unpaired Dataset
Poor Quality   Good Quality   Validation   Total
3195           3140           330          6665

Table 5.1.2
5.2 UIEB Dataset
The UIEB (Underwater Image Enhancement Benchmark) [20] consists of two subsets.
There are 890 raw underwater photos in the first collection, along with accompanying
high-quality reference photographs. The second subset is composed of 60 challenging
underwater images, designed to present more complex scenarios for benchmarking
image enhancement algorithms.
6. Metrics and Results
6.1 Evaluation Metrics
In our study, we apply three standard measures, namely Peak Signal-to-Noise Ratio
(PSNR), Structural Similarity (SSIM), and the Underwater Image Quality Metric (UIQM).
By quantitatively comparing the restored images with their corresponding ground
truths, these metrics offer a thorough evaluation of the image enhancement
performance.
6.1.1 Peak Signal-to-Noise Ratio (PSNR)
The PSNR (Peak Signal-to-Noise Ratio) serves as an approximation for the
reconstruction quality of a generated image x when compared to its ground truth, and
it is calculated based on their Mean Squared Error (MSE) using the following formula:
$\mathrm{PSNR}(x, y) = 10 \log_{10}\left(\frac{255^2}{\mathrm{MSE}(x, y)}\right)$   (6)
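For reference, Eq. (6) can be computed directly with NumPy; the sketch below assumes 8-bit images stored as arrays of the same shape.

```python
import numpy as np

def psnr(x: np.ndarray, y: np.ndarray) -> float:
    """Peak Signal-to-Noise Ratio between a restored image x and its ground truth y (Eq. 6)."""
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")      # identical images
    return 10.0 * np.log10(255.0 ** 2 / mse)
```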
6.1.2 Structural similarity index measurement (SSIM)
SSIM (Structural Similarity) compares image patches by evaluating three properties:
luminance, contrast, and structure. Its definition is as follows:
$\mathrm{SSIM}(x, y) = \left(\frac{2\mu_x\mu_y + c_1}{\mu_x^2 + \mu_y^2 + c_1}\right)\left(\frac{2\sigma_{xy} + c_2}{\sigma_x^2 + \sigma_y^2 + c_2}\right)$   (7)
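SSIM is typically computed with an existing implementation rather than by hand; the sketch below relies on scikit-image's structural_similarity function (the channel_axis argument assumes a recent scikit-image version and H×W×3 RGB uint8 inputs).

```python
from skimage.metrics import structural_similarity

def ssim_rgb(x, y):
    """Mean SSIM over local windows of two RGB uint8 images (Eq. 7 applied per window)."""
    return structural_similarity(x, y, channel_axis=-1, data_range=255)
```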
6.1.3 Underwater Image Quality Metric (UIQM)
Underwater image colorfulness measure (UICM), underwater image sharpness
measure (UISM), and underwater image contrast measure (UIConM) are the three
attribute measurements that make up the underwater image quality measure (UIQM).
The Underwater Image Colorfulness Measure (UICM) addresses the colour casting problem
that is common in underwater photos. Colours attenuate with increasing water depth
according to their individual wavelengths; in particular, red light, which has the
longest wavelength in the visible range, fades first. As a result, photos taken
underwater frequently have a blue or greenish tint. Limited illumination further
contributes to the notable colour desaturation of underwater photos.
To measure the UICM, we use the two opponent colour components related to
chrominance, RG and YB, as defined in Eqs. (8) and (9):
$RG = R - G$   (8)

$YB = \frac{R + G}{2} - B$   (9)

Underwater images commonly contain substantial noise, so the measurement of
underwater image colorfulness relies on asymmetric alpha-trimmed statistical values
rather than conventional statistics. The alpha-trimmed mean is defined as:

$\mu_{\alpha,RG} = \frac{1}{K - T_{\alpha_L} - T_{\alpha_R}} \sum_{i = T_{\alpha_L}+1}^{K - T_{\alpha_R}} \mathrm{Intensity}_{RG,i}$   (10)

and the second-order statistic, the variance $\sigma^2$, as:

$\sigma^2_{\alpha,RG} = \frac{1}{N} \sum_{p=1}^{N} \left(\mathrm{Intensity}_{RG,p} - \mu_{\alpha,RG}\right)^2$   (11)

The comprehensive metric employed for gauging underwater image colorfulness is then:

$\mathrm{UICM} = -0.0268\,\sqrt{\mu^2_{\alpha,RG} + \mu^2_{\alpha,YB}} + 0.1586\,\sqrt{\sigma^2_{\alpha,RG} + \sigma^2_{\alpha,YB}}$   (12)
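As a rough NumPy sketch of Eqs. (8)–(12): the trimming fraction alpha = 0.1 below is an illustrative choice rather than a value stated here, and the helper names are ours.

```python
import numpy as np

def _alpha_trimmed_stats(values: np.ndarray, alpha: float = 0.1):
    """Asymmetric alpha-trimmed mean (Eq. 10) and variance (Eq. 11) of a channel."""
    v = np.sort(values.ravel())
    k = v.size
    t = int(alpha * k)                 # number of samples trimmed from each tail
    mu = v[t:k - t].mean()
    var = np.mean((values - mu) ** 2)  # variance taken over all N pixels, as in Eq. (11)
    return mu, var

def uicm(img: np.ndarray, alpha: float = 0.1) -> float:
    """Underwater Image Colorfulness Measure from the RG and YB opponent channels (Eq. 12)."""
    r = img[..., 0].astype(np.float64)
    g = img[..., 1].astype(np.float64)
    b = img[..., 2].astype(np.float64)
    rg = r - g                          # Eq. (8)
    yb = (r + g) / 2.0 - b              # Eq. (9)
    mu_rg, var_rg = _alpha_trimmed_stats(rg, alpha)
    mu_yb, var_yb = _alpha_trimmed_stats(yb, alpha)
    return -0.0268 * np.sqrt(mu_rg ** 2 + mu_yb ** 2) + 0.1586 * np.sqrt(var_rg + var_yb)
```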

The Underwater Image Sharpness Measure (UISM) measures sharpness at edges in several
stages. First, an edge detector is applied to each RGB colour component. A grayscale
edge map is then obtained by multiplying the resulting edge map by the source image,
which preserves only those pixels that were originally part of an edge in the
underwater image. Note that images with a consistent backdrop and non-periodic
patterns are suitable candidates for the Enhancement Measure Estimation (EME) metric.

$\mathrm{UISM} = \sum_{c=1}^{3} \lambda_c\, \mathrm{EME}(\mathrm{grayscale\ edge}_c)$   (13)

$\mathrm{EME} = \frac{2}{k_1 k_2} \sum_{l=1}^{k_1} \sum_{k=1}^{k_2} \log\!\left(\frac{I_{\max,k,l}}{I_{\min,k,l}}\right)$   (14)
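A simplified sketch of Eqs. (13) and (14) follows; the Sobel edge detector, the 8×8 block size, and the use of standard luminance coefficients as placeholders for the weights λ_c are assumptions made for illustration only.

```python
import numpy as np
from scipy import ndimage

def eme(gray: np.ndarray, block: int = 8) -> float:
    """Enhancement Measure Estimation (Eq. 14) over non-overlapping blocks."""
    h, w = gray.shape
    k1, k2 = h // block, w // block
    total, eps = 0.0, 1e-6
    for i in range(k1):
        for j in range(k2):
            patch = gray[i * block:(i + 1) * block, j * block:(j + 1) * block]
            total += np.log((patch.max() + eps) / (patch.min() + eps))
    return 2.0 / (k1 * k2) * total

def uism(img: np.ndarray, weights=(0.299, 0.587, 0.114)) -> float:
    """Underwater Image Sharpness Measure (Eq. 13): EME of the edge-weighted channels."""
    score = 0.0
    for c, lam in enumerate(weights):
        channel = img[..., c].astype(np.float64)
        edges = np.abs(ndimage.sobel(channel))      # simple edge detector (assumption)
        score += lam * eme(edges * channel)         # grayscale edge map = edge map x source
    return score
```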

The Underwater Image Contrast Measure (UIConM) is designed to assess contrast, a
factor known to correlate with underwater visual performance, including stereoscopic
acuity. In underwater images, the degradation of contrast is typically attributed to
backward scattering.

$\mathrm{UIConM} = \log \mathrm{AMEE}(\mathrm{Intensity})$   (15)

The overall quality measure is a weighted combination of the three components:

$\mathrm{UIQM} = c_1\,\mathrm{UICM} + c_2\,\mathrm{UISM} + c_3\,\mathrm{UIConM}$   (16)

where $c_1$, $c_2$, and $c_3$ are weighting constants.
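In code, Eq. (16) could be sketched as below; the weights c1 = 0.0282, c2 = 0.2953, c3 = 3.5753 are values commonly used in the UIQM literature rather than values reported here, and uiconm is a hypothetical helper that would wrap a log-AMEE implementation of Eq. (15).

```python
def uiqm(img, c1=0.0282, c2=0.2953, c3=3.5753):
    """Underwater Image Quality Measure (Eq. 16) as a weighted sum of its three components."""
    # uicm and uism are the sketches above; uiconm is a hypothetical log-AMEE helper (Eq. 15).
    return c1 * uicm(img) + c2 * uism(img) + c3 * uiconm(img)
```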

6.2 Summary
In conclusion, a higher PSNR score indicates that the result is more congruent with
the reference image's content, a higher SSIM score indicates that the result is more
comparable to the reference image's structure and texture, and a higher UIQM score
indicates that the result is more in line with human visual perception.
6.3 Experimental Results
6.3.1 Metrics Comparison
Metric   UDCP    UWGAN   CycleGAN
SSIM     0.24    0.74    0.87
PSNR     14.32   23.74   28.77
UIQM     2.27    3.10    3.07

Table 6.3
6.3.2 Results Comparison
Fig 6.3.2 Results of various methods: input and output images for UDCP, UWGAN, and CycleGAN
7. Conclusion
In conclusion, this research offers valuable insights into the effectiveness of
diverse underwater image restoration techniques. While UDCP fell short of expectations
for underwater applications, UWGAN [21] and CycleGAN demonstrated notable potential.
UWGAN [21] excelled in the generation of synthetic underwater images, while CycleGAN
emerged as the preferred choice for restoring both paired and unpaired underwater
datasets, showcasing superior performance across a range of evaluation metrics. The
implementation of CycleGAN proved to be the most promising technique, yielding
superior results across all three metrics, with notable excellence in PSNR. The use
of two generators and two discriminators, initially trained on paired datasets and
later tested on unpaired data, underscored the effectiveness of CycleGAN in tackling
the complexities inherent in underwater image restoration. This approach highlighted
the model's adaptability to unpaired data, thereby enhancing its practical utility.
REFERENCES:
1. Rahman, S., Li, A.Q., Rekleitis, I.: Svin2: an underwater slam system using sonar,
visual, inertial, and depth sensor. In: Proceedings of the IEEE International
Conference on Intelligent Robots and Systems (IROS), pp. 1861–1868 (2019)
2. Jiang, Y., Zhao, M., Wang, C., Wei, F., Wang, K., Qi, H.: Diver’s hand gesture
recognition and segmentation for human–robot interaction on auv. Signal, Image and
Video Processing, 1–8 (2021)
3. He, K., Sun, J., Tang, X.: Single image haze removal using dark channel prior.
IEEE Trans. Pattern Anal. Mach. Intell. 33(12), 2341–2353 (2010)
4. Drews, P., Nascimento, E., Moraes, F., Botelho, S., Campos, M.: Transmission
estimation in underwater single images. In: Proceedings of the IEEE International
Conference on Computer Vision (ICCV), pp. 825–830 (2013)
5. Kim, J.-Y., Kim, L.-S., Hwang, S.-H.: An advanced contrast enhancement using
partially overlapped sub-block histogram equalization. IEEE Trans. Circuits Syst.
Video Technol. 11(4), 475–484 (2001)
6. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S.,
Courville, A., Bengio, Y.: Generative adversarial nets. Adv. Neural Inf. Process.
Systems 27 (2014)
7. Yan, L., Zheng, W., Wang, F.-Y., Gou, C.: Joint image-to-image translation with
denoising using enhanced generative adversarial networks. Signal Process. Image
Commun. 91, 116072 (2021)
8. Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and
super-resolution. In: European Conference on Computer Vision(ECCV), pp. 694–711
(2016)
9. Schettini, R., Corchs, S.: Underwater image processing: state of the art of
restoration and image enhancement methods. EURASIP J. Adv. Signal Process. 2010,
1–14 (2010)
10. Iqbal, K., Salam, R.A., Osman, A., Talib, A.Z.: Underwater image enhancement
using an integrated colour model. IAENG Int. J. Comput. Sci. 34(2), 239–244 (2007)
11. Huang, D., Wang, Y., Song, W., Sequeira, J., Mavromatis, S.: Shallow-water
image enhancement using relative global histogram stretching based on adaptive
parameter acquisition. In: International Conference on Multimedia Modeling(MMM),
pp. 453–465 (2018)
12. Bai, L., Zhang, W., Pan, X., Zhao, C.: Underwater image enhancement based on
global and local equalization of histogram and dual-image multi-scale fusion. IEEE
Access 8, 128973–128990 (2020)
13. Akkaynak, D., Treibitz, T.: Sea-thru: A method for removing water from
underwater images. In: Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition(CVPR), pp. 1682– 1691 (2019)
14. Peng, Y.-T., Cosman, P.C.: Underwater image restoration based on image
blurriness and light absorption. IEEE Trans. Image Process. 26(4), 1579–1594 (2017)
15. Hashisho, Y., Albadawi, M., Krause, T., von Lukas, U.F.: Underwater color
restoration using u-net denoising autoencoder. In: Proceedings of the International
Symposium on Image and Signal Processing and Analysis (ISPA), pp. 117–122 (2019)
16. Li, C., Guo, C., Ren, W., Cong, R., Hou, J., Kwong, S., Tao, D.: An underwater
image enhancement benchmark dataset and beyond. IEEE Trans. Image Process. 29,
4376–4389 (2019)
17. Li, C., Anwar, S., Porikli, F.: Underwater scene prior inspired deep underwater
image and video enhancement. Pattern Recogn. 98, 107038 (2020)
18. Li, J., Skinner, K.A., Eustice, R.M., Johnson-Roberson, M.: Watergan:
Unsupervised generative network to enable real-time color correction of monocular
underwater images. IEEE Robot. Autom. Lett. 3(1), 387–394 (2017)
19. Chen, X., Yu, J., Kong, S., Wu, Z., Fang, X., Wen, L.: Towards real-time
advancement of underwater visual quality with gan. IEEE Trans. Industr. Electron.
66(12), 9350–9359 (2019)
20. Li, C., Guo, C., Ren, W., Cong, R., Hou, J., Kwong, S., Tao, D.: An underwater
image enhancement benchmark dataset and beyond. IEEE Trans. Image Process. 29,
4376–4389 (2019)
21. Wang, N., Zhou, Y., Han, F., Zhu, H., Yao, J.: UWGAN: underwater GAN for real-
world underwater color restoration and dehazing. arXiv preprint arXiv:1912.10269
(2019)
