
A Comprehensive Analysis of Cloud Segmentation Using U-Net Architecture

Introduction
This study is conducted against the backdrop of the "Understanding Clouds from Satellite Images"
competition, which calls for accurate segmentation of cloud patterns in satellite imagery. The
U-Net architecture was chosen to tackle this challenge because of its well-documented performance in
semantic segmentation tasks and because it is readily available in the PyTorch framework via the
Catalyst library. U-Net balances model complexity against performance, which makes it a good fit for
detecting intricate cloud patterns.
Data Overview
The images in the dataset carry labels that indicate whether clouds are present and which types of
cloud they are (Fish, Flower, Gravel, Sugar). The labels are given in run-length encoded format, which
needs decoding in order to generate binary masks. The label file is balanced in the sense that each
cloud type has the same number of rows.
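The decoding step can be sketched as follows. This is a minimal illustration, not the code used in this study. It assumes the run-length string is a sequence of space-separated "start length" pairs, that pixel positions are 1-indexed, and that pixels are counted down each column first (column-major order). Those conventions, the function name rle_decode, and the height and width arguments are assumptions that should be checked against the competition's data description.

```python
from typing import List, Optional


def rle_decode(rle: Optional[str], height: int, width: int) -> List[List[int]]:
    """Decode a run-length string into a height x width binary mask.

    Assumed format (not stated in the report): "start length start length ...",
    1-indexed starts, pixels numbered column by column (top to bottom, then
    left to right). A missing or empty string yields an all-zero mask.
    """
    flat = [0] * (height * width)
    if rle:
        tokens = [int(t) for t in rle.split()]
        if len(tokens) % 2 != 0:
            raise ValueError("run-length string must contain start/length pairs")
        for start, length in zip(tokens[0::2], tokens[1::2]):
            begin = start - 1          # convert 1-indexed start to 0-indexed
            end = begin + length
            if begin < 0 or end > len(flat):
                raise ValueError("run falls outside the mask")
            flat[begin:end] = [1] * length
    # Column-major layout: flat index = col * height + row.
    return [[flat[c * height + r] for c in range(width)] for r in range(height)]


if __name__ == "__main__":
    for row in rle_decode("1 2 6 1", height=3, width=3):
        print(row)
```

One mask is decoded per image and cloud type, so an image with several cloud types yields several binary masks of the same shape.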
Exploratory data analysis (EDA) found that the training dataset contains 5546 images, with 5546
label rows for each cloud type (Fish, Flower, Gravel, Sugar). However, the NaN values in
the "EncodedPixels" column indicate that not every image contains every cloud type. A distinguishing
characteristic of the dataset is that several cloud types can coexist in a single image.
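The distinction between label rows and actual masks can be checked with a short count. The sketch below uses a tiny made-up sample rather than the real training file. The "Image_Label" column name and its "image_cloudtype" layout are assumptions; only "EncodedPixels" is named in this report. Empty cells stand in for the NaN values.

```python
import csv
import io
from collections import Counter

# Made-up sample in the assumed layout: one row per (image, cloud type) pair.
SAMPLE_CSV = """Image_Label,EncodedPixels
img_a.jpg_Fish,1 3 10 2
img_a.jpg_Flower,
img_a.jpg_Gravel,5 4
img_a.jpg_Sugar,
img_b.jpg_Fish,
img_b.jpg_Flower,
img_b.jpg_Gravel,2 2
img_b.jpg_Sugar,7 1
"""


def count_labels(csv_text: str):
    """Return (rows per class, non-empty masks per class, classes per image)."""
    rows_per_class = Counter()
    masks_per_class = Counter()
    classes_per_image = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Split on the last underscore so image names may contain underscores.
        image, cloud_type = row["Image_Label"].rsplit("_", 1)
        rows_per_class[cloud_type] += 1
        if row["EncodedPixels"].strip():      # empty cell = no mask for this class
            masks_per_class[cloud_type] += 1
            classes_per_image[image] += 1
    return rows_per_class, masks_per_class, classes_per_image


if __name__ == "__main__":
    rows, masks, per_image = count_labels(SAMPLE_CSV)
    print("rows per class: ", dict(rows))
    print("masks per class:", dict(masks))
    print("classes per image:", dict(per_image))
```

On data in this layout the row counts are equal for every class by construction, so class balance is better judged from the non-empty mask counts.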
Motivation Behind Selection of Model
The U-Net architecture was chosen for the "Understanding Clouds from Satellite Images" competition
because of its proven performance in semantic segmentation tasks that require pixel-level precision. The
model's architecture, which combines a contracting path with an expanding path, allows it to capture both
contextual information and exact localization, making it well suited to recognising and segmenting
complicated cloud patterns. Furthermore, its track record across numerous image segmentation problems,
together with its availability in the PyTorch framework via the Catalyst library, provides a solid
platform for building and experimenting with segmentation models. The selection rests on the idea that
U-Net strikes a compromise between model complexity and performance, making it a good fit for this
particular satellite image processing task.
Model Architecture
This task requires pixel-wise classification of cloud types in images, for which a semantic segmentation
model is appropriate. The selected architecture is based on U-Net, a common model for image
segmentation tasks. The U-Net design consists of a contracting path for capturing contextual information
and an expanding path for precise localization.
Catalyst, a deep learning research and development library, was used to implement the model. The model
is built with the PyTorch framework and draws on a segmentation models library, which provides
pre-trained models for image segmentation tasks.
Results
The U-Net model, which used a ResNet50 encoder pre-trained on ImageNet, performed well on our
dataset with four distinct classes. With the chosen parameters, the architecture demonstrated good
feature extraction and segmentation capabilities. The loss plot, which tracks training progress,
showed the training and validation losses converging over the course of training. This convergence
indicates that the model learns the fundamental patterns in the data efficiently, striking a balance
between fitting the training set and generalising to new, previously unseen data. Because the model
can capture fine features and nuances within the dataset, it is a good fit for the segmentation task
at hand. The results show convergence at step 4, where the training and validation losses both settle
at their minimum of 0.7 and remain there through step 5.

These encouraging results highlight the usefulness of the chosen model architecture and lay the
groundwork for additional investigation and development. Together with the other performance measures,
the segmentation accuracy supports the U-Net with a ResNet50 encoder as a promising solution for the
particular challenges posed by our dataset.
Model and Literature Comparison
In this work we applied a semantic segmentation strategy to cloud analysis in satellite data, using a
U-Net architecture with a ResNet50 encoder. Our choice of model design was driven by its capacity to
capture fine detail, which is required for correctly outlining cloud formations. The motivation for
choosing this model is its demonstrated efficacy in a range of computer vision tasks, particularly in
cloud classification tasks, as reported in the literature.
When we compare our methods to previous research, we see that our emphasis on semantic segmentation
coincides with a broader trend in the literature, where convolutional neural networks (CNNs) are often
used for cloud analysis. However, the specific problems addressed in past work differ. For example,
Ahmed and Sabab tackle the difficult problem of classifying cloud organisation patterns by employing
EfficientNet, a scaled-up convolutional neural network, as the encoder and U-Net as the decoder. The
emphasis on classifying cloud organisation patterns distinguishes their work from our
segmentation-oriented method.
The model presented by Zhan et al., on the other hand, deals with the difficult job of discriminating
between clouds and snow at the pixel level. To improve the accuracy of cloud and snow identification,
this method combines a fully convolutional neural network (FCN) with a multiscale prediction technique.
While the precise focus of this work differs from ours, the underlying use of deep learning for remote
sensing is similar.
By offering a semantic segmentation model tailored to cloud analysis, our work adds to the
current body of literature. The comparison with previous work emphasises the wide range of deep
learning applications in remote sensing, from classifying cloud organisation patterns to accurate
pixel-level cloud and snow identification. Each solution targets a different set of problems, reflecting
the evolving landscape of deep learning applications in climate research and remote sensing.
References
https://www.kaggle.com/competitions/understanding_cloud_organization/code
https://github.com/milesial/Pytorch-UNet
Jiao, L., Huo, L., Hu, C., & Tang, P. (2020). Refined UNet: UNet-based refinement network for cloud
and shadow precise segmentation. Remote Sensing, 12(12), 2001.
Ahmed, T., & Sabab, N. H. N. (2022). Classification and understanding of cloud structures via satellite
images with EfficientUNet. SN Computer Science, 3, 1-11.
Tymchenko, B., Marchenko, P., & Spodarets, D. (2020). Segmentation of cloud organization patterns
from satellite images using deep neural networks. Herald of Advanced Information Technology, 1(3),
352-361.
Zhan, Y., Wang, J., Shi, J., Cheng, G., Yao, L., & Sun, W. (2017). Distinguishing cloud and snow in
satellite images via deep convolutional network. IEEE Geoscience and Remote Sensing Letters, 14(10),
1785-1789.
