You are on page 1of 17

TOMSK

POLYTECHNIC
UNIVERSITY

FULLY CONVOLUTIONAL NETWORKS FOR SEMANTIC


SEGMENTATION

Konstantin A. Maslov

18th June
2020
Outline
• Introduction
• The idea behind fully convolutional networks
• What is convolution?
• Challenges of the classic machine learning approach
• Problems of the convolutional neural networks
• How to build an image-to-image model
• Some well-known models
• U-Net, V-Net, SFCN
• Applications
• To biomedical image analysis
• To autonomous vehicles
• To remote sensing
• Problems
• Conclusion
2/17
Introduction
Image segmentation is a computer vision task in which we
label specific regions of an image according to what's being
shown. The goal of semantic image segmentation is to label
each pixel of an image with a corresponding class of what is
being represented. It finds its application in such areas as
biomedical image analysis, self-driving cars, remote sensing
and so on.

3/17
2D Convolution in Image Processing (1)
Convolution is a mathematical operation that does the
integral of the product of 2 functions(signals), with one of the
signals flipped

4/17
2D Convolution in Image Processing (2)
2D convolutions are used as image filters, and when you
would like to find a specific patch on an image.

5/17
Classical approaches
It’s possible to use classic approaches, such as Bayes classifier,
decision trees, support vector machines, etc., to classify pixels
of an image.
But if we use only color intensities of the pixels, then we lose
the information about pixels’ vicinity, consequently, we do not
consider textures of the objects and their shapes.
We could add some texture and shape features with
convolutions (or other techniques), but there are some
problems:
• Which kernels should we use?
• How to model hierarchical relationships between pixels and
objects?

6/17
Convolutional neural networks
One of solutions is to use convolutional neural networks.

LeCun Y., Bottou L., Bengio Y., and Haffner P. Gradient-Based Learning Applied to Document
Recognition // Proceedings of the IEEE Research – 1998 – Vol. 86 – Issue 11 – pp. 2278–2324

But there is still a problem:


• If we classify every pixel, then it’s computationally
expensive
7/17
Fully convolutional networks
• Get the latent space
representation of the image
with a convolutional neural
network
• Classify every “pixel” of the
latent space with 1*1
convolution
• Use transposed or one could use
convolutions (upsampling + a decoder
convolutions also can be submodel
used) to restore the
segmentation map from
the classification result

8/17
U-Net model

Ronneberger O., Fischer P., and Brox T. U-Net: Convolutional Networks for Biomedical Image
Segmentation // arXiv preprint arXiv:1505.04597. – 2015. – URL: https://arxiv.org/abs/1505.04597
9/17
V-Net model

Milletari F., Navab N., and Ahmadi S.A. Fully Convolutional Neural Networks for Volumetric Medical Image
Segmentation // arXiv preprint arXiv:1606.04797. – 2016. – URL: https://arxiv.org/abs/1606.04797
10/17
SFCN-256 model

Li YangYang, Chen YanQiao, Liu GuangYuan, and Jiao LiCheng A Novel Deep Fully Convolutional
Network for PolSAR Image Classification // Remote Sensing. – 2018. – Issue 10. –
DOI:10.3390/rs10121984
11/17
Application to biomedical image analysis

12/17
Application to autonomous vehicles

13/17
Application to remote sensing

14/17
Problems
• Fully convolutional networks are computationally expensive
to train and apply
• The process of producing ground-truth data is time-
consuming
• Fully convolutional networks are hard to debug
• Fully convolutional networks have lots of hyperparameters
that have to be tuned to obtain high accuracy

15/17
References to begin with
• Shelhamer E., Long J., and Darrell T. Fully Convolutional
Networks for Semantic Segmentation // IEEE Transactions
on Pattern Analysis and Machine Intelligence . – 2016. –
DOI:10.1109/tpami.2016.2572683
• Santos L.A. Image Segmentation [online]. Gitbooks, Artificial
Intelligence . – 2018. – URL:
https://leonardoaraujosantos.gitbooks.io/artificial-
inteligence/content/image_segmentation.html (viewed 18
June 2020)

16/17
Conclusion
• Fully convolutional networks are extremely effective models
to do semantic segmentation of digital images showing
state-of-the-art performance
• The use of fully convolutional networks is accompanied by
the need for intensive computations

17/17

You might also like