
Vamsi Prakash Makkapati

CNN based non-linear image processing


for robust “Pixel Vision”
Master's Thesis

submitted for the attainment of the academic degree

Diplom-Ingenieur

Degree programme: Information Technology

————————————–

Alpen-Adria-Universität Klagenfurt
Fakultät für Technische Wissenschaften

Reviewer: Univ.-Prof. Dr.-Ing. Kyandoghere Kyamakya

Institut für Intelligente Systemtechnologien


Transportation Informatics

Klagenfurt, January 2009


Statutory Declaration (Eidesstattliche Erklärung)

I declare on my word of honor that I have produced this academic work independently and have myself carried out the activities directly connected with it. I further declare that I have used no aids other than those indicated. All formulations and concepts taken verbatim or in their essential content from printed or unprinted sources or from the Internet are cited according to the rules for academic work and identified by footnotes or other precise references.

The support provided during the course of the work, including significant supervision, is fully acknowledged.

This academic work has not been submitted to any other examination authority. The work has been submitted in printed and electronic form; I confirm that the content of the digital version is identical to that of the printed version.

I am aware that a false declaration will have legal consequences.

(Signature) (Place, Date)

Abstract
Many of the basic image processing tasks, such as edge detection, color conversion,
histogram operations, thresholding, correlation, interpolation for data extraction,
etc., act on the image at a pixel level to extract useful high level information. These
basic tasks are referred to as pixel vision in this report. One of the challenges in performing these tasks is the time taken by the processor to operate over the whole image; another is handling the complexity of the input image. In order to overcome these two limiting factors and to strike a balance between them, custom-made platforms are being developed. Even these custom-made platforms suffer from some limitations when dealing with real-time deployment. The issue with real-time deployment is that the system should be compatible with different algorithms depending on different conditions, like the time of day, weather conditions, etc. This calls for a flexible platform, which these custom-made platforms are not.

This flexibility is offered by the CNN platform. Developed by Chua and Yang in 1988, this platform, being a parallel processor, can process the information in real time (in many cases even faster than real time, i.e. 25 frames per second) and also has the much-needed flexibility to be deployed in real-time scenarios.
The main objective of this thesis is to use Cellular Neural Networks to develop a couple of template sets for basic image processing tasks such as edge detection and contrast enhancement. In addition, a model of the basic cell, which constitutes the CNN array, has been implemented on the MATLAB/Simulink computing platform. This implementation comes in handy when dealing with the deployment of CNN on FPGAs at a later stage.
In this work we intend to answer the following research questions:

• How far can CNN-based processing be used to realize edge detection?

• How robust would CNN-based image processing be under difficult conditions, especially difficult lighting conditions?

• How can CNN be efficiently implemented on top of MATLAB/Simulink, in view of rapid prototyping?

In the quest to answer these questions, a novel technique for detecting edges, the translation residue method, has been developed. This technique was inspired by a well-known traditional image processing task, morphological edge detection; modifications were made to this method in order to incorporate it into CNN. Another feature of the translation residue method is its flexibility when using linear templates. A qualitative comparison has been made between the results of the proposed method and results from the relevant image processing literature.
In the process of analyzing the robustness of the proposed method (the translation residue method), a technique for contrast enhancement, originally proposed by Brendel et al. [1], was implemented with some modifications. This can greatly influence the performance of the algorithm in difficult lighting conditions.

Keywords: Edge detection, Contrast enhancement, Cellular Neural Networks, Translation residue method.

Acknowledgments

I would like to thank my supervisor, Univ.-Prof. Dr.-Ing. Kyandoghere Kyamakya, for giving me the chance to work with the institute and for his constant support and advice throughout my work, as well as for the financial support offered. It was a great learning experience working with him. Furthermore, I would like to thank Dr.-Ing. Jean Chamberlain Chedjou for his valuable suggestions and his motivating talks. I would also like to thank Dr. Do Trong Tuan for reviewing this thesis. I am deeply indebted to my mentor, Koteswara Rao Anne, M.Sc., for his constant moral support and valuable suggestions at the right time.
My colleagues from the Department of Transportation Informatics, M.Sc. Muhammad Ahsan Latif, Venkata Sai Dattu Potapragada, Hima Deepthi Vankayalapati, Abin Thomas Mathew and Ranga Raj Gupta Singanamallu, supported me in my work. I want to thank them for all their help, support, interest and valuable suggestions. Special thanks to M.Sc. Alireza Fasih for helping me get my work started.

I thank my parents, Mallikarjuna Rao and Lakshmi, for supporting my studies overseas, and I am indebted to them for inculcating in me the dedication and discipline to do well in whatever I undertake.

Contents

Statutory Declaration (Eidesstattliche Erklärung)

Abstract

Acknowledgments

1 Introduction
1.1 Context of the work
1.2 Motivation
1.3 Problem statement / Objectives
1.4 Major contributions of the thesis
1.5 Thesis outline

2 Background / Preliminaries
2.1 CNN basics
2.1.1 The CNN of Chua and Yang
2.1.2 Main generalizations
2.1.3 A formal definition
2.1.4 Applications of CNN
2.2 Summary

3 Edge detection using CNN
3.1 Edge detection and its importance
3.2 How to judge the presence of an edge
3.3 Traditional methods for edge detection and their limitations
3.4 Review of the different approaches for edge detection on the CNN platform
3.5 Dilation residue method
3.6 Translation residue method
3.7 Summary

4 Contrast enhancement using CNN
4.1 Contrast enhancement and traditional methods for enhancing the contrast
4.2 Additive image enhancement using contrast and intensity
4.2.1 Implementation details
4.2.2 Results
4.3 Summary

5 Modeling CNN cell on Simulink
5.1 Need for implementation of CNN on MATLAB/Simulink
5.2 Review of the process of modeling CNN in MATLAB
5.3 Implementing CNN on top of MATLAB/Simulink
5.4 Results

6 Conclusions and future directions
6.1 Conclusions
6.2 Future directions

A A note on nonlinear templates
A.1 Purpose of introduction
A.2 Uses
A.3 Remarks with respect to MATLAB simulation

B The art of CNN template design

C Pseudocode for CNN implementation

Bibliography

List of Figures

1.1 Cooperative computer vision stack
2.1 Electronic circuit model of an isolated cell
2.2 A 4x4 array of cells
2.3 The cells which come under the r-neighborhood set of the blue cell when r = 1, 2, 3 respectively
2.4 Representation of an isolated cell
2.5 A cell connected to its neighbors
2.6 Figure showing feedback synapses
2.7 Electronic circuit model of the non-isolated cell
2.8 Nonlinear output functions; (a) gaussian, (b) inverse gaussian, (c) unity gain with saturation, (d) inverse sigmoid, (e) sigmoid, (f) high gain with saturation
2.9 A NUP-CNN with two kinds of cells
2.10 A MNS-CNN with two kinds of neighbor sets
2.11 Examples of grids; (a) triangular, (b) hexagonal, (c) rectangular
2.12 Boundary cells and regular cells
2.13 The way the templates act on an image
2.14 An example of a horizontal Connected Component Detector
2.15 Output of vertical and diagonal CCD
2.16 CCD applied on two different images containing the same numeral
2.17 CCD applied on two different images, showing their features
3.1 The results of using various edge detectors on a given grey-scale image with the same threshold, 0.1
3.2 States of the CNN array at different times and the output states
3.3 Results of applying the template set defined in equation 3.5 to two different images
3.4 Results of using the template set given in equation 3.6 with different threshold values
3.5 Results of using the template set given in equation 3.6 on a gray-scale image with varying noise levels
3.6 Results of using the template set given in equation 3.6 on a gray-scale image with varying noise levels
3.7 Results of smoothing the image
3.8 Results of adaptively smoothing the image
3.9 Figure to illustrate the working of the dilation residue method of edge detection
3.10 Result of thresholding the image with different threshold values
3.11 Result of using different Sobel detectors on an image
3.12 Result of dilating the Sobel horizontal and vertical edge detection results
3.13 Result of subtracting the dilated images from the original images
3.14 The final result of the dilation residue method for detecting edges
3.15 Problems with the output of the dilation residue method
3.16 Result of residue calculation using the translation method
3.17 The final edge map calculated using the translation residue method
3.18 Comparison of results from the two proposed methods
4.1 Figure showing the input and the output images (and their corresponding histograms) of histogram equalization
4.2 Input and output of the ACIE method
4.3 Plot showing a comparison of pixel values, in a row, between the original and the enhanced image
4.4 Comparison between the adaptively enhanced image and uniformly enhanced images
4.5 Plot showing a comparison of pixel values, in a row, between the original and the enhanced image
4.6 Result of the ACIE method
4.7 Situation where the headlights make the surrounding environment dull
4.8 Result of enhancing figure 4.7
4.9 Result of the second pass (passing the result from the first step again as input to the enhancement engine)
4.10 Result of the third pass
4.11 Result of the fourth pass
5.1 Graph of a cell in Simulink
5.2 3x3 matrix convolution block
5.3 Subsystem from the 3x3 convolution block
5.4 3x3 extractor subsystem
5.5 Building block of the 3x3 extractor block
5.6 Figure showing how the information flows after the convolution is performed
5.7 Block implementing the piecewise-linear sigmoid function
5.8 Model of a cell in Simulink

List of abbreviations

µm micrometer

ADAS Advanced Driver Assistance Systems

CCD Connected Component Detection

CNN Cellular Neural Network

CNN-UM Cellular Neural Network Universal Machine

DSP Digital Signal Processor

DT-CNN Discrete Time Cellular Neural Network

FPGA Field Programmable Gate Array

HDL Hardware Description Language

HMI Human Machine Interface

KCL Kirchhoff's Current Law

KVL Kirchhoff's Voltage Law

MNS-CNN Multiple Neighborhood Size Cellular Neural Network

NUP-CNN Non-Uniform Processor Cellular Neural Network

PWL Piecewise Linear

SC-CNN State Controlled Cellular Neural Network

VCCS Voltage Controlled Current Source

Chapter 1

Introduction

1.1 Context of the work


This work is part of an ongoing project called "DriveSafe", which targets the development of a multi-camera vision component for ADAS (Advanced Driver Assistance Systems), allowing ADAS to operate on a comprehensive, dynamic and semantically interpreted view of the current traffic scene. The driver's view is augmented by fusing different camera views into one global view of the current traffic scene.
The scientific focus and key contributions of the DriveSafe project reside in the following points: (a) exploitation of mobility in computer vision to speed up the processing towards reaching real-time capabilities; (b) involvement of the "cellular neural network" paradigm to perform appropriate nonlinear and adaptive image processing to ensure robustness of the perception even under very severe visual conditions; (c) the use of cooperative vision in order to enhance the reliability of the scene perception; and (d) the development and use of an analogic (i.e., analogue and logic) computing platform, based at its core on a digital emulation of a CNN-based universal machine implemented on system-on-chip platforms (e.g. FPGA) besides some digital signal processor(s) (DSP) or microcontroller(s), to significantly speed up the nonlinear image processing in order to meet the real-time constraints of practice.
Figure 1.1 shows a graphical representation of the cooperative computer vision stack, which consists of eight layers. The bottom layer I works at pixel level and is fed by raw data from cameras and movement data from an inertial system. The data from layer I are passed layer by layer up to the top-most layer VIII. In every layer, the data are analyzed and transformed. Finally, the output of the highest layer is a semantic interpretation of the current traffic scene, which is presented to the driver through an adequate HMI (Human Machine Interface). For example, an output of the topmost layer could be: "An occluded pedestrian will emerge in the next 3 seconds". The HMI could warn the driver with audio feedback like:

[Figure: the layered stack. Cameras CAM 1...n and an inertial system feed layer I (Pixel-vision); above it sit layer II (Cluster-vision; rudimentary implementation, just for the prototype), layer III (Semantic-vision), layer IV (Relevance Filtering), layer V (Map Building) and layer VI (Traffic Scene Interpretation), topped by the HMI. Temporal relationships between frames are handled per layer (I-t: prediction, II-t: tracking, III-t: learning, IV-t: map refinement, V-t: traffic scene prediction); inter-layer communication allows knowledge from higher layers to be exploited in lower layers, with abstraction increasing from bottom to top.]

Figure 1.1: Cooperative computer vision stack

"Pedestrian!", or with a warning signal.


From the bottom to the top, the data/information experience a bandwidth (i.e. data rate) decrease and an entropy increase. While in the bottom layer a huge amount of pixel data with little utilizable information content needs to be processed, the top-most layer delivers a description of the current traffic scene. The small bandwidth at layers V and VI (compared to pixel level) enables a real-time map exchange between all participating parties (at a relatively low data rate over a wireless link), which would not be possible if the "huge" raw image data were exchanged in real time. However, the layer at which the cooperating vehicles exchange data will depend on the specific use case and on the quality of the data exchanged; in fact, in some cases, data exchange may involve lower layers of the protocol stack.
The growth of the stack into the image plane reflects the time between successive
captured frames. Due to the movement of road users, the inter-frame relationship
is strongly related to the time between two consecutive captured frames.
This work can be placed in layer I of the stack, hence the name pixel vision. It aims at developing a couple of modules which deal with the images captured from the cameras mounted in different directions on the vehicle. One of these modules helps in extracting relevant high-level information which can be used by algorithms at later stages. The other module, the contrast enhancement module, helps in augmenting the driver's view of the current traffic scene, especially when the lighting conditions are bad.

1.2 Motivation
According to the European Transport White Paper 2001, "transport by road is the most dangerous and the most costly in terms of human life". In 2000, 40,000 people were killed in the European Union by road accidents and 1.7 million were injured. The estimated indirect costs of road accidents are about EUR 160 billion. Only a very small percentage of these accidents is due to technical errors. The human being is the weakest part of the driving chain, because human senses have limited perception, and additional negative factors like stress, fatigue, distraction and self-overestimation can significantly downgrade a driver's performance.
This is where driver assistance systems come in. They help augment the capabilities of drivers, decreasing the negative effects in the process. There are many methods to augment the driver's capability: mechanical systems like ABS, ESP and adaptive headlamps, computer vision systems, etc. Computer vision systems can perform tasks like enhancing visibility, lane following, collision warning, distance estimation from another vehicle, etc. Inputs from cameras looking in different directions can be put together to present the real-time traffic situation surrounding the vehicle to the driver. This could enhance the driver's decision making regarding lane changes, overtaking, etc. From these examples it can be seen that CV plays an important role in future ADAS.
The main problem with CV systems is their speed limitation. This major speed limitation can be overcome by using the parallel processing capability of CNN. The CNN, being an analogue parallel processor, can process the information at very high speed compared to traditional CV algorithms, which run serially. Processing at up to 10,000 frames per second [2] has been reported for primitive image processing tasks like edge detection, blurring, sharpening, etc. A 100x100 CNN array implemented on a chip with deep sub-micron technology (0.33-0.25 µm) has the same computational power as a supercomputer containing approximately 9000 Pentium 200 MHz processors. Apart from the aspect of speed, the flexibility of the CNN processor also motivates its use in image processing: only the parameters constituting the templates need to be designed for a specific task.

1.3 Problem statement / Objectives


In the context of the DriveSafe project, the basic tasks in image processing that involve direct processing of the input image pixels can be performed using CNN. This is worthwhile because these basic tasks consume the greater part of the overall time taken for processing and extracting high-level information from the raw images; manipulating this high-level information in the later stages can be achieved even with serial processing.

These basic image processing tasks are coined as pixel vision in this thesis. In the context of this thesis, the following modules are considered for implementation:

• Edge detection

• Contrast enhancement

• Implementing or emulating a CNN cell in Simulink

Edge detection and contrast enhancement are useful in the later stages of the algorithm. Edge detection extracts information about the edges in a picture; this high-level information can be used in later stages of processing such as segmentation, lane-marking detection, etc. Contrast enhancement helps to improve visibility for the driver in cases of low illumination or night-time driving, where light from the headlights of oncoming vehicles makes it difficult for the driver to perceive the road ahead. The last part of the work, developing a model of the CNN cell in Simulink, is helpful in the later stages of implementation, when it becomes necessary to deploy the algorithm on an FPGA.

1.4 Major contributions of the thesis


The major contributions of this thesis are listed below:

• A survey of the basics of CNN is presented.

• A novel method for edge detection, the translation residue method, was developed, and its results were compared with the results of standard image processing techniques.

• The image enhancement procedure developed by Csapody et al. [3] was implemented on CNN and tested on different images. The results were compared with results obtained using photo-editing software such as Photoshop.

• A model of the basic CNN cell has been developed on the Simulink platform.

1.5 Thesis outline


This thesis is organized as follows:

• Chapter 2 gives the background on:

– Cellular neural networks
– Edge detection
– Contrast enhancement

This chapter also includes some methods that are already implemented in their respective fields.

• Chapter 3 deals with the proposed method for edge detection using CNN. Details are also given regarding the problems faced while developing this method.

• Chapter 4 gives the details regarding the implementation of the image enhancement algorithm.

• Chapter 5 describes the modeling of a basic CNN cell on the Simulink platform.

• Chapter 6 concludes this report with a brief discussion of the results and future directions.

• Appendix A gives a comparison between linear and nonlinear templates. This discussion brings to light the various factors which led to the proposal of nonlinear templates.

• Appendix B reports the different possible ways of designing CNN templates, which are the heart of CNN. This section is especially useful as it provides some basic knowledge.

• Appendix C provides the pseudocode used for the CNN implementation.

The results for each module are discussed in the respective chapters.
Chapter 2

Background / Preliminaries

2.1 CNN basics


The concept of Cellular Neural Networks (CNN), also called Cellular Nonlinear Networks, was introduced in 1988 by Leon O. Chua and Lin Yang [4, 5, 6]. The original idea from Chua was to use an array of simple, nonlinearly coupled dynamic circuits to process large amounts of information in real time. The nonlinear analog nature of CNN was inspired by neural networks [7], and the architecture of CNN (regularly placed cells) was inspired by cellular automata [8]. Chua also showed that his new architecture was able to efficiently perform time-consuming tasks, such as image processing and the solution of partial differential equations. Another feature of CNN is its ease of implementation on VLSI. In the subsequent sections, basic concepts and definitions regarding CNN are discussed.

2.1.1 The CNN of Chua and Yang


In spite of the many generalizations that followed, the original CNN model of Chua and Yang [4, 5, 6] remains the most widely used. This can be attributed to the fact that this model offers a good compromise between simplicity and versatility and is also easy to implement.

The cell
The fundamental building block of CNN is the cell, also called an isolated cell. The cell is a lumped circuit containing both linear and nonlinear elements. Figure 2.1 shows the electrical structure of the cell as proposed by Chua and Yang in 1988. The subscripts u, x and y denote the input, state and output, respectively, and the corresponding node voltages are $V_u$, $V_x$ and $V_y$. (When the cell is connected in an array, these quantities carry a suffix indicating the position of the cell; for example, the input U of the cell at row i, column j is written $U_{ij}$ and the node voltage $V_u$ is written $V_{u_{ij}}$.)


Figure 2.1: Electronic circuit model of an isolated cell

The output $y_{ij}$ is a memoryless nonlinear (piecewise-linear) function of the state $x_{ij}$. This dependency is given as follows:

$$y_{ij} = f(x_{ij}) = \frac{1}{2}\left(|x_{ij} + 1| - |x_{ij} - 1|\right) \qquad (2.1)$$
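For reference, this output nonlinearity (the unity-gain-with-saturation function of figure 2.8(c)) can be written directly in MATLAB; a minimal sketch, with the test vector chosen purely for illustration:

    % Piecewise-linear output function of equation 2.1: identity inside
    % [-1, 1], saturating at -1 and +1 outside that interval.
    f = @(x) 0.5 * (abs(x + 1) - abs(x - 1));
    y = f([-2 -0.5 0 0.5 2])   % yields [-1 -0.5 0 0.5 1]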
Except for $I_{yx}$, which is a voltage-controlled current source, all the other elements in the basic cell are linear:

$$I_{yx} = \frac{1}{R_y}\, f(v_{x_{ij}}) \qquad (2.2)$$


The cell also contains an independent current source I, called the bias, and a group of VCCSs, $I_{xu}$ and $I_{xy}$. It is assumed that $|x_{ij}(0)| \le 1$ (initial condition constraint) and that the input, provided by the independent voltage source E, is constant with $|u_{ij}| \le 1$ (input constraint).
At this point, exploiting the well-known Kirchhoff laws (KVL and KCL), the circuit equation can be expressed as:

$$C\, \frac{dV_x}{dt} = -\frac{1}{R_x}\, V_x + I + I_{xu} + I_{xy} \qquad (2.3)$$

Equation 2.3 is the state equation and equation 2.1 is the output equation of the cell. These two equations completely define the basic cell.

The CNN array


A CNN is an array of basic cells. Each cell is coupled only with its neighbors, i.e. the cells are connected locally. Adjacent cells interact directly with each other, while more distant cells influence each other indirectly by propagation; in this sense the cells are globally connected. A typical 4x4 CNN is shown in figure 2.2.
The squares represent the cells, and the links between the squares represent the interconnections between the cells. The generic cell placed on the i-th row and

Figure 2.2: A 4x4 array of cells

the j-th column is denoted C(i, j). This model proposed by Chua and Yang also assumes that the CNN is space-invariant, i.e. all the cells in the CNN have the same properties. The CNN model as proposed by Chua and Yang is generalized in section 2.1.2.
The connections between the cells shown in figure 2.2 need not be limited to the immediately adjacent cells; the extent of the 'direct' connections between the cells is limited by the r-neighborhood set.

Definition 2.1.1 (r-neighborhood [9]) The r-neighborhood of C(i,j) is:

$$S_{ij}(r) = \left\{ C(k,l) \;\middle|\; \max\left(|k - i|, |l - j|\right) \le r,\; 1 \le k \le M,\; 1 \le l \le N \right\}, \qquad (2.4)$$

where $r \in \mathbb{N} \setminus \{0\}$ is the radius, and M and N are the dimensions of the array.
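As a small illustration, the index pairs belonging to $S_{ij}(r)$ can be enumerated in MATLAB as follows (a sketch; the scalars M, N, i, j and r are assumed to be given):

    % Enumerate the cells C(k,l) of the r-neighborhood S_ij(r) of C(i,j),
    % clipping the index ranges to the M x N array as in equation 2.4.
    [K, L] = meshgrid(max(1, i-r):min(M, i+r), max(1, j-r):min(N, j+r));
    neighbors = [K(:), L(:)];   % each row is one index pair (k, l)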

Examples of neighborhoods of the same cell (represented in blue in figure 2.3) for r = 1, 2, 3 are shown in figure 2.3. The coupling between the center cell $C_{ij}$ and the cells belonging to its neighbor set is obtained by means of the linear VCCSs mentioned in section 2.1.1. If an isolated cell is represented as shown in figure 2.4, then the connections between the cell and its neighbors can be drawn as in figure 2.5 (in this figure only the connections between the inputs of the cells are shown; there are also connections between the outputs and the states).
From figure 2.5 we can observe that the center cell is denoted $C_{ij}$ and the neighboring cells $C_{kl}$; this is the general convention for differentiating the neighboring cells from the center cell. The connections present between the cells are called synapses. Since the inputs of all neighboring cells are fed to the input of the center cell, these connections are termed feed-forward synapses, and the resultant input contribution to the center cell may be given as:

$$\sum_{kl} b_{kl}\, u_{kl} \qquad (2.5)$$

Figure 2.3: The cells which come under the r-neighborhood set of the blue cell when r = 1, 2, 3 respectively

Figure 2.4: Representation of an isolated cell $C_{ij}$, with input $U_{ij}$, state $X_{ij}$, output $Y_{ij}$ and bias $Z_{ij}$

Figure 2.5: A cell connected to its neighbors



Figure 2.6: Figure showing feedback synapses

In the same way we also have feedback synapses, shown in figure 2.6, obtained by connecting the output signals of all neighboring cells to the center cell. The contribution of these feedback synapses to the center cell is:

$$\sum_{kl} a_{kl}\, y_{kl} \qquad (2.6)$$

The terms $a_{kl}$ and $b_{kl}$ are the weights of the connections, called the synaptic weights. Once the cell is connected in an array, the representation of its electronic circuit changes as shown in figure 2.7, and it is then called a non-isolated cell. The terms marked in blue indicate the changes that take place when a cell is connected in a neighborhood. In summary, the coupling between $C_{ij}$ and the

Figure 2.7: Electronic circuit model of the non-isolated cell



cells belonging to its neighbor set $N_r(i,j)$ is obtained by means of the linear VCCSs $I_{xu}(i,j;k,l)$ and $I_{xy}(i,j;k,l)$. In fact, the input and output of any cell $C(k,l) \in N_r(i,j)$ influence the state $x_{ij}$ of $C_{ij}$ by means of two VCCSs, defined by the equations:

$$I_{xy}(i,j;k,l) = A(i,j;k,l)\, v_{y_{kl}}, \qquad (2.7a)$$

$$I_{xu}(i,j;k,l) = B(i,j;k,l)\, v_{u_{kl}}, \qquad (2.7b)$$
It is important to note here that the coupling between the cells in the CNN is local.
This restriction is very important for the feasibility of hardware implementation. As
previously mentioned, however, cells that do not belong to the same neighborhood
set can still affect each other indirectly because of the propagation effects of the
continuous-time dynamics of the network.
After performing the nodal analysis, the circuit equation for a non-isolated cell is expressed as:

$$C_x\, \frac{dv_{x_{ij}}(t)}{dt} = -\frac{1}{R_x}\, v_{x_{ij}}(t) + \sum_{C(k,l) \in N_r(i,j)} a_{ij,kl}\, v_{y_{kl}}(t) + \sum_{C(k,l) \in N_r(i,j)} b_{ij,kl}\, v_{u_{kl}}(t) + I_{ij} \qquad (2.8)$$
The term $I_{ij}$ is the constant bias value. The output equation remains the same as equation 2.1, since the output depends only on the state of the cell itself. As can be seen from equation 2.8, the number of cells influencing the state equation of a given cell is limited by the number of cells in the neighborhood set $N_r$; thus there is one value of $a_{ij,kl}$ and one of $b_{ij,kl}$ for each cell in the neighborhood. These values, the synaptic weights, completely define the behavior of the network for a given input and initial conditions, and they are called the templates. For ease of representation they can be written as matrices. There are three types of templates:

• Feed-forward template or control template
• Feedback template
• Static template or the bias.

Since all these templates are space-invariant, we call them cloning templates. As equation 2.8 shows, the template coefficients completely define the behavior of the network for a given input and initial condition; consequently, changing the set of templates changes the behavior of the system for the same input and initial conditions. These templates are often expressed in compact form by means of tables or matrices. For example, the following two square matrices are used for a CNN with r = 1:

$$A = \begin{pmatrix} A(i,j;i-1,j-1) & A(i,j;i-1,j) & A(i,j;i-1,j+1) \\ A(i,j;i,j-1) & A(i,j;i,j) & A(i,j;i,j+1) \\ A(i,j;i+1,j-1) & A(i,j;i+1,j) & A(i,j;i+1,j+1) \end{pmatrix} \qquad (2.9)$$

 
$$B = \begin{pmatrix} B(i,j;i-1,j-1) & B(i,j;i-1,j) & B(i,j;i-1,j+1) \\ B(i,j;i,j-1) & B(i,j;i,j) & B(i,j;i,j+1) \\ B(i,j;i+1,j-1) & B(i,j;i+1,j) & B(i,j;i+1,j+1) \end{pmatrix} \qquad (2.10)$$
This representation of the template elements enables us to write the state equation of the CNN in a more compact form by means of the two-dimensional convolution operator '*'. The state equation 2.8 can be rewritten as:
$$C_x\, \frac{dv_{x_{ij}}(t)}{dt} = -\frac{1}{R_x}\, v_{x_{ij}}(t) + A * v_{y_{ij}}(t) + B * v_{u_{ij}} + I, \qquad 1 \le i \le M,\; 1 \le j \le N \qquad (2.11)$$

The values of $C_x$ and $R_x$ can be conveniently chosen by the designer; the product $C_x R_x$ determines the rate of change of the dynamics of the circuit. For the sake of convenience, the normalized/dimensionless form of the dynamics of the array is given as:
$$\frac{dx_{ij}(t)}{dt} = -x_{ij}(t) + A * y_{ij}(t) + B * u_{ij} + I, \qquad 1 \le i \le M,\; 1 \le j \le N \qquad (2.12)$$
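For concreteness, equation 2.12 can be integrated numerically with a forward-Euler scheme, using conv2 for the template sums. The sketch below is illustrative only: the function name cnn_sim, its argument list and the zero-padded boundary handling are assumptions made here, not the implementation of this thesis (see Appendix C for the pseudocode actually used).

    function y = cnn_sim(u, x0, A, B, I, h, T)
    % Forward-Euler integration of the normalized CNN state equation (2.12).
    % u: input image, x0: initial state, A/B: feedback and control templates,
    % I: bias, h: step size, T: simulation time. conv2(..., 'same') zero-pads,
    % which corresponds to the fixed boundary condition u = y = 0.
    f  = @(x) 0.5 * (abs(x + 1) - abs(x - 1));   % output nonlinearity (2.1)
    Bu = conv2(u, rot90(B, 2), 'same') + I;      % input term, constant in time
    x  = x0;
    for t = 0 : h : T
        dx = -x + conv2(f(x), rot90(A, 2), 'same') + Bu;  % right side of (2.12)
        x  = x + h * dx;                                  % Euler step
    end
    y = f(x);                                    % steady-state output estimate
    end

(The rot90(., 2) calls make conv2 behave as a sliding correlation, so the templates are applied with the orientation used in the template sums above.)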

2.1.2 Main generalizations


Many generalizations have been made to the basic CNN model proposed by Chua and Yang. The purpose of these generalizations has been to enhance the capabilities of CNN, broaden its field of application or improve the efficiency of existing networks [10, 11, 12]. Some of the most significant generalizations are mentioned here, and a more general definition of CNN covering most of the particular cases is given in the next section.

Nonlinear and delay CNNs [12]


The CNN model of Chua and Yang was indeed a nonlinear circuit because of the presence of the output function (2.1). Some authors [13], however, refer to it as a linear CNN to emphasize the linearity of the VCCSs, which determine the coupling and thereby the way the CNN processes the input. In order to make the CNN a truly nonlinear processor, the template elements (synaptic weights) should be nonlinear functions, so relationships 2.7a and 2.7b are rewritten as:

$$I_{xy}(i,j;k,l) = \hat{A}_{ij;kl}\left(v_{y_{kl}}, v_{y_{ij}}\right) + A^{\tau}_{ij;kl}\, v_{y_{kl}}(t - \tau), \qquad (2.13a)$$

$$I_{xu}(i,j;k,l) = \hat{B}_{ij;kl}\left(v_{u_{kl}}, v_{u_{ij}}\right) + B^{\tau}_{ij;kl}\, v_{u_{kl}}(t - \tau), \qquad (2.13b)$$

where $\hat{A}_{ij;kl}(\cdot,\cdot), \hat{B}_{ij;kl}(\cdot,\cdot) : C(\mathbb{R}) \times C(\mathbb{R}) \to \mathbb{R}$ (i.e. they are real-valued continuous functions of, at most, two variables), while $A^{\tau}_{ij;kl}, B^{\tau}_{ij;kl} \in \mathbb{R}$ and $\tau \in [0, \infty)$. A nonlinear coupling is introduced by $\hat{A}_{ij;kl}$ and $\hat{B}_{ij;kl}$, while a delayed (functional) dependence is introduced by $A^{\tau}_{ij;kl}$ and $B^{\tau}_{ij;kl}$. It can now be seen that equations 2.7a and 2.7b are special cases of equations 2.13a and 2.13b respectively. The state equation defined in 2.12 changes correspondingly to:

$$\frac{dx_{ij}}{dt} = -x_{ij}(t) + \hat{A} * y_{ij}(t) + A^{\tau} * y_{ij}(t - \tau) + \hat{B} * u_{ij}(t) + B^{\tau} * u_{ij}(t - \tau) + I, \qquad 1 \le i \le M,\; 1 \le j \le N \qquad (2.14)$$
Another observation, made on comparing equation 2.14 with equation 2.11, is that the input is no longer required to be time-invariant; this also helps in making the CNN a more general analog processing paradigm. When the delay terms $A^{\tau} * y_{ij}(t-\tau)$ and $B^{\tau} * u_{ij}(t-\tau)$ are zero, the array becomes a purely nonlinear CNN, and when the nonlinear terms $\hat{A} * y_{ij}$ and $\hat{B} * u_{ij}$ are zero, the array is a delay-type CNN. The purpose of introducing nonlinear templates is discussed in detail in Appendix A.

Other extensions relate to the memoryless output equation 2.1. An arbitrary bounded nonlinear function $f : \mathbb{R} \to \mathbb{R}$ can be used in place of equation 2.1. Some possible output functions are shown in figure 2.8.


Figure 2.8: Nonlinear output functions; (a) gaussian, (b) inverse gaussian, (c) unity
gain with saturation, (d) inverse sigmoid, (e) sigmoid, (f) high gain with saturation

Non-uniform processor CNNs and multiple neighborhood size CNNs


Motivated partly by neurobiological structures, other generalizations have been proposed. These include non-uniform grid CNNs: grids with more than one type of cell and/or more than one size of neighborhood, called Non-Uniform Processor CNNs (NUP-CNN) and Multiple-Neighborhood-Size CNNs (MNS-CNN) respectively. Examples are depicted in figures 2.9 and 2.10.
In figure 2.9, two types of cells, one in white and the other in blue, are present in the same layer; this is an example of a Non-Uniform Processor CNN. Figure 2.10 represents a Multiple-Neighborhood-Size CNN with two sizes of neighborhood. Here all the cells are the same, but the white cells belong to layer A and the blue cells belong to the second layer, layer B. The grid on layer A is a fine grid with r = 1, while the grid on layer B is a coarse grid with r = 3. In the figure, connections to only one cell in layer B are shown (blue lines). It is also possible to have different types of grids, and in every type of grid there is the possibility of having NUP-CNN and MNS-CNN. Some types of grids inspired by biological systems are shown in figure 2.11.

Figure 2.9: A NUP-CNN with two kinds of cells

Figure 2.10: A MNS-CNN with two kinds of neighbor sets



Figure 2.11: Examples of grids; (a) triangular, (b) hexagonal, (c) rectangular

There are also other generalizations, like Discrete-Time CNNs (DT-CNNs). These have many practical features which are not possessed by the conventional continuous-time CNN, including some appreciable robustness properties. Another feature is the ability to control the speed of the network; this controllability is obtained by adjusting the clock. On the other hand, some features of continuous systems are lost, as is always the case when a continuous system is converted to a discrete one. For example, in a DT-CNN a symmetric template does not guarantee complete stability.
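For contrast with the continuous dynamics above, a common discrete-time formulation replaces the differential equation by an iterated map. The sketch below is an assumed illustration of this idea (x, u, A, B, I and nSteps are taken as given), not a formulation quoted from this thesis:

    % One common DT-CNN iteration: x(k+1) = A*y(k) + B*u + I, y = sign(x).
    Bu = conv2(u, rot90(B, 2), 'same') + I;   % constant input contribution
    for k = 1 : nSteps
        y = sign(x);                          % hard-limiter output of each cell
        x = conv2(y, rot90(A, 2), 'same') + Bu;
    end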

2.1.3 A formal definition


Having gone through the evolution of CNN, we can now give a formal definition of Cellular Neural Networks. The latest and most general definition of a cellular neural network is reported here; these definitions are quoted from [9].

Definition 2.1.2 (Cellular Neural Network) A cellular neural network (CNN) is a high-dimensional dynamic nonlinear circuit composed of locally coupled, spatially recurrent circuit units called 'cells'. The resulting net may have any architecture, including rectangular, hexagonal, toroidal, spherical and so on. The CNN is defined mathematically by four specifications:

1. Cell dynamics.

2. Synaptic law.

3. Boundary Condition.

4. Initial conditions.
CHAPTER 2. BACKGROUND / PRELIMINARIES 17

Definition 2.1.3 (Cell dynamics) The internal circuit core of the cell can be any dynamical system. The cell dynamics is defined by an evolution equation. In the case of continuous-time lumped circuits, the dynamics are defined by the state equation:

$$\dot{x}_\alpha = -g\left(x_\alpha, z_\alpha, u_\alpha(t), I^s_\alpha\right), \qquad (2.15)$$

where $x_\alpha, z_\alpha, u_\alpha \in \mathbb{R}^m$ are the state vector, threshold and input vector of the cell $n_\alpha$ at position $\alpha$, respectively, $I^s_\alpha$ is the synaptic law, and $g : \mathbb{R}^m \times \mathbb{R}^m \times \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}^m$ is a vector field.

Definition 2.1.4 (Sphere of influence) The sphere of influence $S_\alpha$ of the cell $n_\alpha$ coincides with the previously defined (Def. 2.1.1) neighbor set $N_r$ without $n_\alpha$ itself (each cell in a cellular neural network is, by definition, coupled locally only to those neighbor cells which lie inside a prescribed sphere of influence of radius r):

$$S_\alpha \doteq N_r(n_\alpha) - \{n_\alpha\} \qquad (2.16)$$

Definition 2.1.5 (Synaptic law) The synaptic law defines the coupling between the considered cell $n_\alpha$ and all the cells $n_{\alpha+\beta}$ within the prescribed sphere of influence $S_\alpha$ of $n_\alpha$ itself:

$$I^s_\alpha = \hat{A}^\beta_\alpha\, x_{\alpha+\beta} + A^\beta_\alpha * f_\beta(x_\alpha, x_{\alpha+\beta}) + B^\beta_\alpha * u_{\alpha+\beta}(t), \qquad (2.17)$$

The first term is the linear feedback of the states of the neighboring cells. The second term is the nonlinear feedback template. The last term accounts for the contribution of external inputs; $B^\beta_\alpha$ is the feed-forward or control template.

Before giving the definition of boundary conditions, let us look at the definition of boundary cells and their distinction from regular cells.

Definition 2.1.6 (Regular and boundary cells) A cell C(i, j) is called a regular cell with respect to $S_r(i,j)$ if and only if all neighborhood cells $C(k,l) \in S_r(i,j)$ exist. Otherwise C(i, j) is called a boundary cell (figure 2.12).

Definition 2.1.7 (Boundary conditions) The boundary conditions are those specifying $y_{kl}$ and $u_{kl}$ for cells belonging to $S_r(i,j)$ of edge cells but lying outside of the $M \times N$ array.

Due to the property of indirect propagation, the boundary cells can also affect the behavior of the whole network.

Definition 2.1.8 (Initial conditions) The initial conditions are those specifying the initial state, $x_{ij}(0)$, and the initial value of the output, $y_{ij}(0)$, for all the cells, both the boundary cells and the regular cells.

Note: The initial state could also be taken to be another image. This is particularly useful in cases where the temporal relation between two images has to be analyzed.

Figure 2.12: Boundary cells and regular cells (the boundary cells are shown for r = 1)

2.1.4 Applications of CNN


In the previous sections we have seen what a CNN is and what its properties are. In the coming section, some typical applications are presented to establish the importance of cellular neural networks in the field of image processing. (The examples presented here are by no means exhaustive, but they are basic applications which also help in understanding the features of CNN.) These examples also help in understanding the dynamic behavior of CNN.

Image processing application


Most of the applications developed in the field of CNN fall under image processing. The striking attention devoted to CNN in the field of image processing can be attributed to the fact that a CNN is a parallel processor. Since the information is processed simultaneously by the cells in an autonomous way (i.e. the principle of parallelism), very little time is needed to process a given input image. The speed at which an operation takes place mostly depends on how fast the system reaches its stable permanent phase. Even with this limitation, the speed at which a given operation is performed on CNN is very high compared to a normal image processing task. Frame rates of around 300 frames per second have been reported for complex image processing tasks like face detection and tracking [14], and for simple tasks like particle detection in jet-engine fluid the frame rates achieved were greater than 10,000 frames per second.

Figure 2.13: The way the templates act on an image



A pictorial representation of how the templates act on an image is shown in figure 2.13. First, template B acts on the input image; then template A acts on the output from the first step of iteration. The static template I is also shown; it acts at every step of iteration in the processing.
In this section, a simple example involving the processing of a binary image is presented. (A pseudocode used for the generation of the MATLAB code is given in Appendix C; all the results shown in this thesis were generated using this code.) The task is to recognize numbers, presented as images. Generally, in any case where we need to recognize something, we first need to differentiate it from others of the same kind; for this, we look for features that distinguish it from the others. This procedure is followed here to recognize various numerals. For the purpose of detecting features in a particular image, we use connected components. Three templates defined in [15] give the number of connected components in three different directions in a given image. They are:

   
$$A = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 2 & -1 \\ 0 & 0 & 0 \end{pmatrix},\; B = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},\; I = 0; \qquad (2.18a)$$

$$A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & -1 & 0 \end{pmatrix},\; B = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},\; I = 0; \qquad (2.18b)$$

$$A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -1 \end{pmatrix},\; B = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},\; I = 0; \qquad (2.18c)$$

Equation 2.18 gives the templates for the horizontal, vertical and diagonal CCD (Connected Component Detector). The purpose of these templates is to find the number of connected components in the horizontal, vertical and diagonal directions of a given image.
Let us try applying one of these templates to a binary image. To simulate the template set, the following details must be given (a sketch of such a simulator call follows the list):

• Input image

• Initial state, a matrix of the same size as the input image; in many cases the initial state is zero

• Time of simulation

• Step size for simulation
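Hypothetically, using the cnn_sim helper sketched after equation 2.12 (an illustrative function, not the thesis code), these details map onto a call such as the one below. The step size and simulation time are assumed values, and note that, since B is zero for the CCD templates, the image has to enter through the initial state:

    % Horizontal CCD (equation 2.18a) applied to a binary image P in [-1, 1].
    A  = [0 0 0; 1 2 -1; 0 0 0];          % feedback template
    B  = zeros(3);                        % no control template
    I  = 0;                               % no bias
    y  = cnn_sim(P, P, A, B, I, 0.1, 10); % image also used as initial state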

The result of applying the template given in equation 2.18a to the image shown in figure 2.14(a) is given in figure 2.14(b). The results of applying the template sets given in equations 2.18b and 2.18c are shown in figure 2.15.

Figure 2.14: An example of a horizontal Connected Component Detector; (a) input binary image, (b) output of applying horizontal CCD



Figure 2.15: Output of vertical and diagonal CCD; (a) output of applying vertical CCD, (b) output of applying diagonal CCD

Figure 2.16: CCD applied on two different images containing the same numeral; (a) CCD result on one image, (b) CCD result on another image

The outputs of applying the CCD to a particular input (here the image of a numeral) differ from the outputs for images containing a different number; these features are used to identify numbers. We can also see from figure 2.16 that even when the number is written in a different way, the outputs have almost the same features; in this way the method can also be used to recognize handwritten text. Figure 2.17 shows the results of the CCD applied in the various directions to two other input images (images with the numerals '5' and '7'). Comparing the results, we see different features for different input images; in this way different numerals can be recognized.

Figure 2.17: CCD applied on two different images, showing their features

2.2 Summary
This second chapter gives details on the basics of Cellular Neural Networks. The first section deals with the definition of a cell and explains how the basic cells are connected in an array to form a Cellular Neural Network. The concept of templates is explained through the description of the connections within an array. The following section describes generalizations that help CNN become a general nonlinear processing array; some of these generalizations are inspired by biological elements such as the cat retina. A formal definition of Cellular Neural Networks is presented, along with some examples from the field of image processing.
Chapter 3

Edge detection using CNN

This chapter gives a brief introduction to what edges are and their importance in image processing, followed by some traditional methods available for detecting edges. A review of templates for detecting edges in a gray-scale image on the CNN platform is also presented. Particular attention is devoted to the novel method proposed for detecting the edges. A qualitative comparison is made between the results of the proposed method and the results of traditional edge detection methods.

3.1 Edge detection and its importance


Edge detection is one of the basic, yet very important, image processing tasks, because the success of higher-level processing relies heavily on how well the edges are detected. Normal images contain enormous amounts of data. For fast and efficient processing of the images, the amount of data has to be decreased, i.e. only the most important information is processed and the rest is left out. This is what edge detection achieves. However, detection of edges in a real scene is still a problem, whether because of speed limitations or because of the complexity of the image. A program which detects edges should be able to strike a balance between these two factors, namely good detection of relevant edges and the time taken to detect these edges.

To detect the edges, we should know what edges are. From a theoretical point of view, an edge is a step or a slope between two uniform-luminance areas. Unfortunately, edges in real scenes rarely comply with this description. An efficient edge detector should comply with some basic requirements: edges should be found with a low probability of false detection due to noise; edges should not be misplaced; the algorithm should perform similarly on any image; and the algorithm should allow an efficient implementation [16, 17].


3.2 How to judge the presence of an edge


If we are able to detect changes in the brightness of the image along a line, then we can detect the edges. To find the changes in intensity, we calculate the difference of pixels along a direction; if this difference is greater than a threshold, we can assume the presence of an edge. If the edges in both the horizontal and the vertical directions are found in this way and summed up, an edge map of the input image can be created.
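As a minimal sketch of this rule in MATLAB (the variable img and the threshold value are assumed here for illustration):

    % Mark an edge wherever the horizontal or vertical pixel difference
    % exceeds a threshold; img is a grayscale image matrix.
    thr = 0.1;                        % assumed threshold value
    dx  = abs(diff(img, 1, 2));       % differences along each row
    dy  = abs(diff(img, 1, 1));       % differences along each column
    edges = (dx(1:end-1, :) > thr) | (dy(:, 1:end-1) > thr);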

3.3 Traditional methods for edge detection and their limitations
Most of the traditional methods available for edge detection derive their basic idea from the concept of derivatives; this is called the derivative method for detecting edges. The derivative is defined as:

$$\dot{f}(x) = \lim_{\epsilon \to 0} \frac{f(x + \epsilon) - f(x)}{\epsilon} \qquad (3.1)$$
From this equation we can see that the derivative is in fact a difference of two consecutive values divided by another difference, so it can be used to estimate the edges in an image. Another important point to note is that edges can be computed in both directions in an image, horizontal and vertical; if the information changes in both directions, we use partial derivatives to estimate the edges. The partial derivative of a function can be estimated as shown in equation 3.2:

$$\frac{\partial f}{\partial x} = \lim_{\epsilon \to 0} \frac{f(x + \epsilon, y) - f(x, y)}{\epsilon} \qquad (3.2)$$
We might also estimate the partial derivative as a symmetric difference:

$$\frac{\partial h}{\partial x} \approx h_{i+1,j} - h_{i-1,j} \qquad (3.3)$$
The result of this symmetric difference on an image will be equivalent to the result of convolving the image with the kernel given below:

$$\mathrm{kernel} = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & -1 \\ 0 & 0 & 0 \end{pmatrix}$$
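This equivalence is easy to try out directly in MATLAB (a sketch; img is an assumed grayscale image matrix):

    % Symmetric-difference estimate of the derivative along the rows.
    kernel = [0 0 0; 1 0 -1; 0 0 0];
    gx = conv2(img, kernel, 'same');  % conv2 flips the kernel, so each output
                                      % pixel is the difference of its two
                                      % row neighbors, as in equation 3.3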

So, this can be taken as a basis for detecting the edges in an image, and it can be improved further. In this respect, several edge processing kernels have been proposed, named after those who proposed them:
   
 −1 0 1   1 1 1 
Prewitt’s kernel: Mx = −1 0 1 ; My = 0 0 0
−1 0 1 −1 −1 −1
   
   
 −1 0 1   1 2 1 
Sobel’s kernel: Mx = −2 0 2 ; My = 0 0 0
−1 0 1  −1  −2 −1
   
0 1 1 0
Robert’s kernel: Mx = ; My =
−1 0 0 −1
The concept of convolution is adopted in the CNN platform developed in this work; therefore, these image processing kernels can be used as templates for the CNN platform. The idea is to use the kernels as the control template in the CNN, so that the template is convolved over the image. The complete template set for Prewitt's horizontal kernel is:

$$A = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}; \quad B = \begin{pmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{pmatrix}; \quad I = 0; \qquad (3.4)$$
The results of using this template set on a grey-scale image are presented in section 3.5. These traditional methods suffer from the limitation that they are very slow. This slowness can be attributed to the way a traditional processor handles the information, namely serially. Apart from the kernels mentioned above, the edge detection process also involves further mathematical steps to threshold the edge map, which additionally increase the time needed to process the image.

3.4 Review of the different approaches for edge detection on the CNN platform
Below are some basic templates defined in the field of CNN for detecting edges. Edge detection on binary images can be achieved by using the following template set, as given in [18].
Template set for Binary Edge detection
   
$$A = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{pmatrix}, \quad I = -1; \qquad (3.5)$$

Input: Static binary image.
Initial conditions: Arbitrary; here taken to be zero.
Boundary conditions: Fixed-type boundary condition, $u_{ij} = y_{ij} = 0$.
Output: Also a binary image, $y(t) \Rightarrow y(\infty)$.

Figure 3.1: The results of using various edge detectors on a given grey-scale image with the same threshold, 0.1; (a) input grayscale image, (b) result of the Canny edge detector, (c) result of the Sobel edge detector, (d) result of Prewitt's edge detector, (e) result of Robert's edge detector

Local rules: The local rules followed to obtain the required result, mapping the static input $u_{ij}$ to the steady-state output $y_{ij}(\infty)$, are:

1. white pixel → white, independent of neighbors


2. black pixel → white, if all neighbors are black
3. black pixel → black, if at least one nearest neighbor is white
4. black, gray or white pixel → gray, if nearest neighbors are gray

Using the template shown above, the results obtained at different time steps are shown in figure 3.2. This template set acts well on different types of binary input images, as shown in figure 3.3: the template set defined in equation 3.5 works well for images of varying complexity.
After the binary edge detection template set, another template set, which detects edges in a gray-scale image, is presented here [18].

Template set for Gray-scale Edge detection


   
$$A = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{pmatrix}, \quad I = -0.5; \qquad (3.6)$$

Input: Static gray-scale image.
Initial conditions: Taken to be zero.
Boundary conditions: Fixed-type boundary condition, $u_{ij} = y_{ij} = 0$.
Output: A binary image, $y(t) \Rightarrow y(\infty)$.

Some interpretations that can be drawn from the images shown in figures 3.4 and 3.5 are listed here:

The results are sensitive to the threshold: Envisage a situation where we have to find the edges in an image sequence; then, depending on the complexity of the frame, a suitable threshold has to be set, because the amount of information that appears in the output depends on the threshold.

Sensitivity to noise: This is an issue especially when the threshold values are low. As the amount of noise in an image increases, the amount of spurious edge information detected with a lower threshold also increases.

From the observations presented above it is clear that noise should be eliminated before applying the templates. One easy way to reduce the effect of noise on the output of edge detection is to apply a low-pass filter to the image, i.e. to use smoothing as a preprocessing step. When the images taken as input in figure 3.5 are given to this preprocessor, which smooths out the image, the output is as shown in figure 3.6.
The conclusion that can be drawn after looking at the results presented in
figure 3.6 is:

• Even though the quality of the output looks better in the resulting image when
a gray-scale edge detector is used on a smoothed image, some edges are
missing. This is due to the fact that image smoothing, in addition
to decreasing the effect of noise, smooths out the sharp edges present in the
image. So, this might in some cases reduce the information that is extracted
out of the original image.

So, the solution to the noise problem could be a smoothing
algorithm which takes care of edges, i.e. an algorithm which preserves the
edges while smoothing out the image.
For conventional image processing, tools like spatially variant blurring were
developed by Gábor [19]. Lev et al. proposed some iterative weighted averaging
methods [20]; they use a weighted mask whose coefficients are based on the evalu-
ation of the differences between the central and neighboring pixels. In another work,
Perona and Malik proposed anisotropic diffusion for adaptive smoothing, formu-
lating the problem in terms of the nonlinear heat equation [21].
This is exactly what Rekeczky et al. did using CNN [22]. They
proposed Dynamic Difference Controlled Nonlinear (DDCN) templates, which
solve this problem by smoothing the image adaptively so that the sharp edges
remain sharp even after the noise is smoothed out. One of the DDCN template sets is
given in equation 3.7:
 
A = 1; \quad \hat{C} = \begin{bmatrix} 0 & \Phi & 0 \\ \Phi & 0 & \Phi \\ 0 & \Phi & 0 \end{bmatrix}; \quad \Phi = g\,\Delta v_{xx}, \quad g = \begin{cases} 1 - |\Delta v_{xx}|/2K, & \text{if } |\Delta v_{xx}| \le 2K \\ 0, & \text{otherwise} \end{cases}; \quad I = 0;   (3.7)
where Δv_xx = v_xkl(t) − v_xij(t) and Ĉ is the nonlinear term controlled by Δv_xx.
Since the term Ĉ depends on a difference which changes dynamically with time, it
is called the dynamic difference controlled nonlinear term. The results reported
in the paper are reproduced in figures 3.7 and 3.8.
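Although the nonlinear templates themselves were not simulated, the coupling nonlinearity of equation 3.7 is easy to state in MATLAB. The sketch below, with an assumed diffusion constant K = 0.1 (the value is illustrative only), shows why the template preserves edges:

% Sketch of the DDCN coupling of equation 3.7. dv stands for the
% difference v_xkl(t) - v_xij(t) between a neighbor and the center cell.
K   = 0.1;                                            % assumed diffusion constant
g   = @(dv) (abs(dv) <= 2*K) .* (1 - abs(dv)/(2*K));  % gating function of eq. 3.7
phi = @(dv) g(dv) .* dv;                              % Phi = g * dv
% A small difference is diffused; a large difference (an edge) contributes
% nothing, so the edge stays sharp:
disp([phi(0.05), phi(0.5)])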
Because of time constraints, this anisotropic diffusion could not be simulated in
MATLAB; the difficulty lies in simulating the effects of nonlinear templates. So,
attention turned to methods that do not involve the use of nonlinear
templates.

3.5 Dilation residue method


In 1987, James Lee et al. proposed the use of morphological operations to detect
edges in a gray-scale image [23]. They came up with the idea of using dilation, a
basic morphological operation, for detecting the edges. The details of the method
are given below. As the name suggests, the final result of this method is the
residue that is left when the dilation operation is carried out on the image: the
difference between the result of dilation and the original image is the edge image.
Figure 3.9 illustrates this intuitively.
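As a plain-MATLAB illustration (using the Image Processing Toolbox, outside the CNN framework), the dilation residue of a binary image can be sketched as follows; the file name is a placeholder:

% Plain-MATLAB sketch of the dilation residue method, independent of the
% CNN templates derived later in this section.
bw    = im2bw(imread('shapes.png'), 0.5);  % placeholder binary input image
se    = strel('diamond', 1);               % plus-shaped structuring element
edges = imdilate(bw, se) & ~bw;            % residue: pixels added by dilation
imshow(edges)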
The problem with this method is the non-availability of templates to perform
dilation on a gray-scale image, so it cannot be used directly in CNN. The
only available template for performing the dilation operation works on a binary
image. So, the first step is to convert the gray-scale image into a binary image.
For this purpose the gray-scale to binary threshold template shown in equation 3.8
is used. The result of using this threshold template can be seen in figure 3.10.
   
A = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{bmatrix}; \quad B = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}; \quad I = -I^{*};   (3.8)
After going through these results, one can recognize that the edge structure of
the input image is not carried over into the binary image. This means that even if we
use an edge detection template on these results, we do not get a good result. So, the
immediate requirements for this process to be carried out on a CNN are:
• to find a template which can convert a gray-scale input image into a binary
image;
• at the same time, the template should be designed such that the edge
structure of the input image is not disturbed when the conversion is
made.
At this juncture it is worth looking back to section 3.3, where some image pro-
cessing kernels were shown. Here it would be a good idea to reuse these kernels:
if they can be modified into CNN templates such that the resulting image is
a binary image, then our problem is solved. After trial and error, the tem-
plate sets that were developed are given in equation 3.9 ((a) the template set for Sobel's
horizontal kernel and (b) the template set for Sobel's vertical kernel), and the results
are reported in figure 3.11.
   
A = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{bmatrix}; \quad B = \begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix}; \quad I = 0.5;   (3.9a)

A = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{bmatrix}; \quad B = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ -1 & -1 & -1 \end{bmatrix}; \quad I = 0.5;   (3.9b)

From the results shown in figure 3.11 it is clear that the gray-scale input image
is converted into a binary image while retaining the edge structure. So, with these
binary images, we can use the dilation residue method. The first task is to apply
the dilation template to these images; the second task is to subtract the results from
the corresponding original images. The template set used for performing the
dilation task is given in equation 3.10, and the results are shown in figure 3.12.
   
A = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}; \quad B = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 0 \end{bmatrix}; \quad I = -4;   (3.10)
After these results are obtained, we can subtract these images from the
images in figure 3.11. For this purpose the template set in equation 3.11 has been
developed:
   
A = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}; \quad B = \begin{bmatrix} 0 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix}; \quad I = -1;   (3.11)
In equation 3.11 the feed-forward template B has '−1' at its center and zeros
everywhere else. Since the input image is convolved with template B — in this case
simply multiplied by '−1' — the image to be subtracted is given as the input and the
other image as the initial state: if an image X is to be subtracted from another
image Y, then X is given as the input image to the CNN and Y as the initial
condition. Applying this template to our case yields the outputs shown in
figure 3.13.
The final task, once we have the edges in both the horizontal and vertical directions,
is to combine them and generate the final edge image. For this purpose the
"Logical OR" template set is used, defined as follows [18]:
   
A = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 0 \end{bmatrix}; \quad B = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 0 \end{bmatrix}; \quad I = 2;   (3.12)
With this template set there is no restriction on which image is given as the input and
which as the initial state. The output of OR-ing the edge maps in
figure 3.13 is shown in figure 3.14.
Comments: looking at the result shown in figure 3.14, one can see that most of
the edges present in the original input image are represented by double lines in the
output. In figure 3.15 these double lines are pointed out clearly. This problem can
be attributed to the structural element of the dilation, i.e. the value of template B
in equation 3.10. With this particular structural element, for every pixel in the
input image, pixels are added in all four directions in the output image. So, for
example, if a vertical line of width one pixel is present in the input image, then the
resulting image has a vertical line of width three pixels. When the original image is
then removed from this image, two one-pixel-wide lines remain in the output image.
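This artifact can be reproduced in a few lines of MATLAB with a toy 5 × 5 image:

% Toy demonstration of the double-edge artifact: a one-pixel-wide
% vertical line dilates to width three, and removing the original image
% leaves two parallel one-pixel lines.
bw = false(5); bw(:, 3) = true;            % vertical line of width one
d  = imdilate(bw, strel('diamond', 1));    % the line grows to width three
disp(d & ~bw)                              % two one-pixel lines remain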

3.6 Translation residue method


Figure 3.15 points out the ill effects of the dilation residue method. As discussed
in the previous section, this effect of forming double edges can be attributed to the
type of structural element used. One solution to this problem could be the use of a
structural element which only allows dilation in one direction. Another, simpler
solution is to move the image in one direction and then take the difference.
Fortunately, templates are already established [9] for translating the image
in any given direction (see equation 3.13). Since this method involves translating
the image and then calculating the residue, it is named the translation residue method.
   
A = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}; \quad B = \begin{bmatrix} b_{-1,-1} & b_{-1,0} & b_{-1,1} \\ b_{0,-1} & b_{0,0} & b_{0,1} \\ b_{1,-1} & b_{1,0} & b_{1,1} \end{bmatrix}; \quad I = 0;   (3.13)

The values of template B depend on the direction in which we want to translate the
image. In every case exactly one value b is 1 and all the other values are 0. For
example, to move the image in the east direction, template B takes the values
defined in B(E) below; B(W) is used for movement in the west direction, and so on:
     
B(E) = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}; \quad B(W) = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}; \quad B(N) = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}; \quad \text{and so on}
Similarly we can formulate templates to move the given binary image in any desired
direction.
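In plain MATLAB the same translate-and-subtract idea can be sketched with circshift standing in for the translation templates of equation 3.13 (bw denotes a binary edge-structure image, as in the earlier sketches):

% Sketch of the translation residue method with circshift as a stand-in
% for the CNN translation templates.
east  = circshift(bw, [0 1]);              % image moved one pixel east
south = circshift(bw, [1 0]);              % image moved one pixel south
edges = (bw & ~east) | (bw & ~south);      % the two residues OR-ed together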
Once we have the image moved in a given direction, we can use equation 3.11 to
take the difference. The results of this new method of translating and taking
the difference are shown in figure 3.16, and the result of combining both images
using equation 3.12 is shown in figure 3.17. It can clearly be seen that the problem
with the double lines is completely avoided; on close inspection it can also be
seen that many edges that were missing from the previous result appear here.
There are still some isolated points in the result, which may be caused by
noise; these small isolated points can be removed easily.
A comparison of the result of the dilation residue method with the result of the
translation residue method is shown in figure 3.18. The pointers in this figure show
the absence of double edges in the case of the translation residue method. The original
image is also shown for qualitative comparison.

3.7 Summary
This chapter has discussed the basics of edge detection and various existing methods
for it, both traditional and CNN based. The limits of these methods in giving a
consistent edge map for inputs of varying complexity were presented, along with
some solutions, using nonlinear templates, already proposed by various research
groups. A classical method for detecting edges through morphological transformations,
the dilation residue method, was presented, and a successful attempt was made to
implement this technique in CNN. The disadvantages of this method were put
forward, and a novel technique, the translation residue method, was presented to
solve them. The results were compared, and very good agreement was obtained
between them.
Figure 3.2: States of the CNN array at different times and the output states — (a) the input image; (b) the initial state; (c) the state after two iterations in MATLAB; (d) the state after 5 iterations; (e) the state after 10 iterations; (f) the final state after around 30 iterations.

Figure 3.3: Results of applying the template set defined in equation 3.5 to two different images — (a), (c) the input images; (b), (d) the edges detected.

Figure 3.4: Results of using the template set given in equation 3.6 with different threshold values — (a) the input gray-scale image; (b) edges detected with a threshold of −1; (c) the output with a threshold of −0.01; (d) the output with a threshold of −0.5.

Figure 3.5: Results of using the template set given in equation 3.6 on a gray-scale image with varying noise levels — (a) the input gray-scale image; (b) edges detected with a threshold of −0.5 and 2% noise; (c) the output with the same threshold but the noise level increased to 5%.

Figure 3.6: Results of using the template set given in equation 3.6 on a smoothed gray-scale image — (a) the image with 2% uniform noise; (b) the image in (a) blurred; (c) the image with 5% uniform noise; (d) the image in (c) blurred; (e) result of applying the edge detection template to the image in (d).

Figure 3.7: Results of smoothing the image

Figure 3.8: Results of adaptively smoothing the image



Figure 3.9: Illustration of the working of the dilation residue method of edge detection

Figure 3.10: Result of thresholding the image with different threshold values — (a) the input image; (b) the gray-scale image thresholded at 0.5; (c) the image thresholded at 0.1.

Figure 3.11: Result of using the different Sobel detectors on an image — (a) result of Sobel's horizontal edge detector; (b) result of Sobel's vertical edge detector.

Figure 3.12: Result of dilating the Sobel horizontal and vertical edge detection results — (a) dilation of the horizontal result; (b) dilation of the vertical result.

Figure 3.13: Result of subtracting the dilated images from the original images — (a) detected edges in the horizontal direction; (b) detected edges in the vertical direction.

Figure 3.14: The final result of the dilation residue method for detecting edges.

Figure 3.15: Problems with the output of the Dilation residue method

Figure 3.16: Result of the residue calculation using the translation method — (a) detected edges in the horizontal direction; (b) detected edges in the vertical direction.

Figure 3.17: The final edge map calculated using the translation residue method.

Figure 3.18: Comparison of results from the two proposed methods — (a) result of the dilation residue method; (b) result of the translation residue method.


Chapter 4

Contrast enhancement using CNN

4.1 Contrast enhancement and traditional methods for enhancing the contrast
Contrast enhancement, as the name suggests, is a type of image enhancement pro-
cedure intended to improve the visual appearance of an image. There are several
available methods, like amplitude scaling, contrast modification, and various kinds of
histogram modifications [24]. Among these, histogram¹ equalization is the most
commonly used and the most intuitive one. In layman's terms, contrast means
difference and enhancement means increase; so, for an image, contrast enhancement
means increasing the difference between the intensities of pixels. If this difference
is increased, then two pixels with intensities close to each other in the input image
can be distinguished easily in the enhanced image.
Formally speaking, histogram equalization is a technique which re-scales
the range of an image's pixel values to produce an enhanced image whose pixel values
are more uniformly distributed throughout the whole range of possible intensities.
An example of histogram equalization is given in figure 4.1. The output
shown in figure 4.1 has been generated using Adobe Photoshop.
We can see from figures 4.1(c) and 4.1(d) that there are pixels with all the
possible intensity values in the output. Another inference that can be drawn from
the output is that some darker regions of the input have been made more visible:
some information which was previously not visible can be seen after the enhancement.
But we can also see that the brightest parts of the input image have been made even
brighter, as a result of which the perception in those regions has decreased. In order
to deal with this situation there are methods like local histogram equalization,
histogram modifications, low-pass filtering, etc. [25]

¹A histogram is a graph showing the number of pixels in an image at each different intensity value found in that image.

Figure 4.1: The input and output images (and their corresponding histograms) of histogram equalization — (a) the input gray-scale image; (b) enhanced image using Adobe Photoshop; (c) result of histogram equalization using MATLAB; (d) the resulting equalized histogram; (e) cumulative histogram of the input image; (f) cumulative histogram of the output image.


4.2 Additive image enhancement using contrast and intensity
As discussed in section 4.1, the most popular method for enhancing the contrast
is histogram equalization. The aim in this section is to enhance the
contrast of an image using CNN. Adaptive histogram equalization methods have
been used by Csapody et al. [3], but there are simpler methods, introduced by
Brendel et al., based on the intensity and contrast content [1]. [1] also discusses
adaptively controlling the image sensor using an adaptive CNN-UM (Cellular Neural
Network Universal Machine).
Due to improper or uneven lighting conditions, important information can be
lost during sensing. If this is the problem with a captured image, then the infor-
mation cannot be improved any further. But if the captured image is merely improper
for human visual perception — a visualization problem — then adaptive image
sensing can be used to improve it. The novelty of this method is the use of the contrast
content for adaptability; no nonlinear templates are used. Moreover, contrast
enhancement is included in addition to intensity enhancement.

4.2.1 Implementation details


The method used for this implementation, called additive image enhance-
ment based on contrast and intensity, was developed by Brendel et al. [1]. It
uses the sum of contrast and intensity functions to achieve enhancement. Let
I(x, y) represent a grayscale image, following the CNN convention, on the range
[−1, 1] (black = 1, white = −1). The image is assumed to be sampled in space, i.e. I is
a matrix I ∈ [−1, 1]^{M×N}, where M × N is the size of the image.
The goal here is to find simple linear techniques. First, the squared intensity
and the contrast are computed, denoted I(x, y) and C(x, y). Second, diffusion
D is used to smooth these values over a given range. Finally, a compensation mask
is computed as a monotonically decreasing function of the diffused intensity and
contrast, respectively. This function was chosen to be f(x) = c₁(1 − c₂x)ⁿ, where
c₁, c₂ and n are constants. The compensation is computed as the multiplication of the
mask and the intensity or contrast, and the intensity and contrast compensations are
added to the original image. The resulting equation of the adaptive contrast and
intensity enhancement (ACIE) transformation is expressed as follows:
I(x,y) \rightarrow I(x,y) + k_1 I(x,y)\bigl(1 - k_2 D(I^2(x,y))\bigr)^{n} + k_3 C(x,y)\bigl(1 - k_4 D(C^2(x,y))\bigr)^{m} = I(x,y) + k_1 I(x,y) M_i(x,y) + k_3 C(x,y) M_c(x,y)   (4.1)

where k₁, k₂, k₃, k₄, n and m are parameters. Parameters k₁ and k₃ control the
magnitude of the intensity and contrast correction respectively, k₂ and k₄ control the
selectivity of the correction, and n and m control the character of the compensation
function. The term k₁I(x, y)Mᵢ(x, y) is the intensity enhancement term, and the
third term, k₃C(x, y)M_c(x, y), is for contrast enhancement. The inspiration for this
model comes from the work of Smirnakis et al. [26].
The range of the diffusion, i.e. the template coefficients or the execution time of the
diffusion template, controls the range considered in the adaptation: when the number
of coefficients in the diffusion template given in equation 4.3 increases, the range of
diffusion increases, and increasing the execution time has the same effect. The basic
phenomenon here is that when the diffusion is applied, the brightness of a pixel is
diffused onto its neighboring pixels. When this diffused image is added to the original
image, the contrast around a bright region increases. This results in an equalization
of the image. The templates used for calculating the contrast and diffusion maps are
shown below:
   
A = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} -0.6 & -0.6 & -0.6 \\ -0.6 & 0.48 & -0.6 \\ -0.6 & -0.6 & -0.6 \end{bmatrix}, \quad I = 0;   (4.2)

(the CONTRAST template set)

A = \begin{bmatrix} 0.1 & 0.15 & 0.1 \\ 0.15 & 0 & 0.15 \\ 0.1 & 0.15 & 0.1 \end{bmatrix}, \quad B = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad I = 0;   (4.3)

(the DIFFUSION template set)

4.2.2 Results
For the sake of simulation, the values of the parameters used are as follows (a MATLAB sketch using these values is given after the list):

k1 = 1;
k2 = 2;
k3 = 100;
k4 = 5; and
m = n = 3;
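The sketch below shows one way equation 4.1 can be realized in plain MATLAB with these parameter values. The CNN diffusion D (the template of equation 4.3 run for some time) is approximated here by a Gaussian blur, and the contrast map by linear filtering with the B template of equation 4.2; both are stand-ins, not the thesis implementation, and the input file name is a placeholder.

% Sketch of the ACIE transformation of equation 4.1.
k1 = 1; k2 = 2; k3 = 100; k4 = 5; n = 3; m = 3;
img = 2*im2double(imread('night.png')) - 1;          % image in the CNN range [-1, 1]
D   = @(X) imfilter(X, fspecial('gaussian', 9, 2), 'replicate');  % diffusion stand-in
Bc  = [-0.6 -0.6 -0.6; -0.6 0.48 -0.6; -0.6 -0.6 -0.6];           % CONTRAST template B
C   = imfilter(img, Bc, 'replicate');                % contrast map C(x,y)
Mi  = (1 - k2 * D(img.^2)).^n;                       % intensity compensation mask
Mc  = (1 - k4 * D(C.^2)).^m;                         % contrast compensation mask
enh = img + k1*img.*Mi + k3*C.*Mc;                   % equation 4.1
imshow(enh, [])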

The result of enhancing the image shown in figure 4.2(a) using equation 4.1 and the
templates given in equations 4.2 and 4.3 is shown in figure 4.2(b).
Figure 4.3 displays two plots which compare the pixel values of one row of the
original image with the same row of the enhanced image.
It can be seen from the plots that the pixel values are enhanced adaptively. Fig-
ure 4.4 shows the difference between an adaptively enhanced image and a uniformly
enhanced image.

Figure 4.2: Input and output of the ACIE method — (a) the input image taken for enhancement; (b) the enhanced output image.

An image can be enhanced by adding a constant value to all its
pixels or by multiplying all its pixels by a constant value; this is called uniform
enhancement. Plots of the pixel values (from row number 180) of the adaptively
enhanced image against those of the uniformly enhanced image are shown in
figure 4.5.
The results of using this algorithm on another test image are shown in figure 4.6.
These results clearly show that the algorithm works adaptively. A real-
life situation where contrast enhancement could help in perceiving the environment
better is the following:
imagine a night-drive scenario in which a big truck is traveling in the opposite
lane. The truck's headlights blind the driver, and the environment around the truck
is not clearly visible. This situation is depicted in figure 4.7. The image can be
enhanced using the aforementioned additive method; the result is shown
in figure 4.8. But this result is not satisfactory, i.e. the enhancement is not yet good
enough for human perception.
Since the basic idea of this method is based on diffusing the intensity of a pixel to
its neighbors, the obvious way to further enhance the result shown in figure 4.8
is to increase the diffusion time. One problem with this solution is that as the
time increases, the amount of data that has to be processed by the simulator also
increases; this led to "out of memory" errors in MATLAB (the system on which the
simulation was run has 2 GB of virtual memory and 1.5 GB of physical memory).
This led to the idea of keeping the simulation time constant and instead feeding
the output back as input to the enhancement engine. This way the engine works
for the same simulation time every time, but the input image is processed iteratively.
The results of successive passes of the input image are shown in figures 4.9–4.11.
Figure 4.3: Plots comparing the pixel values, in a row, between the original and enhanced images — (a) comparison of the values on row 100; (b) comparison of the values on row 180.

Figure 4.4: Comparison between adaptively enhanced and uniformly enhanced images — (a) uniformly enhanced image by addition; (b) adaptively enhanced image; (c) uniformly enhanced image by multiplication; (d) adaptively enhanced image.

From the above results it can be seen that, as the enhancement is performed,
high-intensity pixels accumulate around the brightest region. This can only be
eliminated by increasing the simulation (diffusion) time, as a longer diffusion time
distributes these high values uniformly over the surrounding pixels. But this balance
between the number of passes and the diffusion time can only be found once these
templates are implemented on a real analog platform.

4.3 Summary
In this chapter the basics of contrast enhancement were presented, together with the
relevant traditional methods for performing it. A technique to adaptively enhance
the contrast of an image, based on its intensity and contrast values, on the CNN
platform was presented. One good feature of this method is that it only contains
linear templates. The results were shown, and graphs were plotted to show the effect
of the image enhancement. A qualitative comparison was made between the results of
the adaptive method and those of uniform enhancement; a quantitative comparison
was also presented in the form of a plot of the pixel values of a single row of the
enhanced images. Finally, a real-life example where contrast enhancement can be
used was presented, with a good result.
Figure 4.5: Plots comparing the pixel values, in a row, of the original image, the adaptively enhanced image, and the uniformly enhanced image — (a) uniform enhancement via addition; (b) uniform enhancement via multiplication.

Figure 4.6: Result of the ACIE method — (a) the input image; (b) the enhanced output image.

Figure 4.7: Situation where the headlights make the surrounding environment dull

Figure 4.8: Result of enhancing figure 4.7



Figure 4.9: Result of the second pass (passing the result of the first pass again as
input to the enhancement engine)

Figure 4.10: Result of third pass



Figure 4.11: Result of fourth pass


Chapter 5

Modeling CNN cell on Simulink

5.1 Need for implementation of CNN on MATLAB/Simulink
For the development of embedded applications, methods of hardware/software
co-design are often needed. A designer has, on the one hand, to solve the algo-
rithm design and, on the other hand, to strike a balance between flexibility
(software implementation) and performance (hardware implementation).
This task grows very complex as the applications become more complex.
At the level of rapid algorithm design, MATLAB is often used for block spec-
ifications and their inner analysis. This is because the MATLAB environment is
characterized by its high-level scripting possibilities, strong support for matrix op-
erations, object-oriented approach, extensive graphing possibilities and a rich set of
application-specific toolboxes.
In contrast, the hardware part of the target implementation has traditionally
been designed in a hardware description language (often VHDL or Verilog). Hence
for every program it is necessary to manually recode the specification, which
in itself is an error-prone process. With MATLAB/Simulink, most of this process
can be eliminated: it generates the HDL code by automatically parsing the
high-level specifications, and its rich variety of toolboxes helps in this automation
process.
This work was not aimed at developing a model of the CNN cell out of DSP
Builder blocks, but at developing a model of the cell using standard MATLAB/Simulink
blocks. This model can be used as a guide for developing a model using DSP
Builder, the only prerequisite being a knowledge of HDL basics.


5.2 Review of the process of modeling CNN in MATLAB
This section gives an idea of the process by which a CNN can be created in MATLAB.
This in turn helps in realizing a cell on top of Simulink.

• Modeling a cell in MATLAB means modeling the differential equation that
defines the cell. So, the basic aim here is to solve a differential equation using
MATLAB. This calls for using an ODE solver, be it ode23, ode45 or another
of the range of solvers that MATLAB provides.

• If we look at the equation of a cell in equation 2.12, we see that the input
to the ODE solver is the result of a summation of several signals. So, we need
to realize this summation in MATLAB, which can simply be done by adding
up all the required signals. This corresponds to using a summation block in
Simulink.

• Again looking at equation 2.12, the signals that are added up are the state
value, the threshold and a couple of convolution results. To realize these
convolutions, MATLAB provides a function that does 2-dimensional
convolution, 'conv2'. In Simulink, however, there is no predefined convolution
block, so we need to develop one.

• After realizing a single cell, it is necessary to use this same cell iteratively
to cover the whole image.

• The necessary inputs have to be read in before performing any of these oper-
ations. A MATLAB sketch combining these steps is given after this list.
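A minimal MATLAB sketch combining the steps above is given below. It assumes linear templates, the standard CNN state equation dx/dt = −x + A ∗ y + B ∗ u + I, and zero initial state; the function and variable names are ours, not part of any toolbox.

% Minimal sketch of a linear-template CNN simulation with conv2 and ode45.
function y = cnn_simulate(u, A, B, I, tf)
    Bu = conv2(u, B, 'same');                 % feed-forward term (constant in time)
    x0 = zeros(size(u));                      % zero initial state
    [~, X] = ode45(@(t, x) cnn_rhs(x, A, Bu, I, size(u)), [0 tf], x0(:));
    x = reshape(X(end, :), size(u));          % final state
    y = pwl_sigmoid(x);                       % output through the PWL sigmoid
end

function dx = cnn_rhs(x, A, Bu, I, sz)
    x  = reshape(x, sz);
    y  = pwl_sigmoid(x);
    dx = -x + conv2(y, A, 'same') + Bu + I;   % CNN state equation
    dx = dx(:);
end

function y = pwl_sigmoid(x)
    y = 0.5 * (abs(x + 1) - abs(x - 1));      % piecewise linear output function
end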

5.3 Implementing CNN on top of MATLAB/Simulink


This section gives the details of the implementation of a CNN cell on top of Simulink,
using the guidelines given in section 5.2. The main aim of implementing a CNN cell
on Simulink, as discussed in section 5.1, is to help the future deployment of the CNN
processor on an FPGA (Field Programmable Gate Array). The process of converting
a Simulink model and soft-coring it onto an FPGA board is as follows:

• The Simulink model is converted to HDL (Hardware Description Language) code
using DSP (Digital Signal Processing) Builder, which is a toolbox for MAT-
LAB.

• This generated HDL code is soft-cored onto the FPGA board.

Based on the guidelines mentioned above, a model of a cell in MATLAB/Simulink
is shown in figure 5.1. This figure gives the abstract model of a cell, without any
details on how the input image is fed in at the input port.

Figure 5.1: Graph of a cell in Simulink — the outputs from other cells pass through the feedback template A and the inputs through the control template B (convolution blocks); their results, the bias I and the −1/R state feedback are summed, integrated (1/C, initial state X_ij(0)) to give the state X_ij, which passes through the output function to give Y_ij.

This abstract model can be modified to develop a fully working CNN cell. The
blocks feeding the input and the outputs into the summation block are modified:
these modified blocks should be able to take the input and the output as images and
then use templates B and A respectively for the convolution.
For the convolution of the image and the template, a "2-D Convolu-
tion" block from the Video and Image Processing toolbox of Simulink could be used.
But the use of this block could cause a problem at a later stage, with the DSP
Builder toolbox. So, a complete 2-D convolution block has been designed using
only basic blocks from the Signal Processing toolbox. This ensures that DSP
Builder can convert the built Simulink model to HDL code.
2-D convolution: the aim here is to develop a block that can perform a
2-D convolution, i.e. convolve the template over the whole image. First a block is
designed which can do the convolution between two 3 × 3 matrices; then this block is
iteratively moved over the image to achieve a 2-D convolution. The model
developed for the 3 × 3 convolution block is shown in figure 5.2. This model takes a
3 × 3 matrix and the template as inputs, chooses the corresponding elements
of the two matrices and multiplies them; finally, all the values obtained from the
multiplications are added up. This is the result of the convolution between two 3 × 3
matrices.
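In MATLAB terms, what this block computes for a 3 × 3 image patch P and a template T is simply:

% What the 3x3 convolution block of figure 5.2 computes: elementwise
% products of patch and template, then the sum of all nine products.
v = sum(sum(P .* T));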
The subsystems present in figure 5.2 are shown in figure 5.3. These
subsystems use two 2-D selector blocks and a multiplier block. The 2-D selector
block is used to select the corresponding elements from both matrices.

Figure 5.2: 3x3 matrix convolution block



Figure 5.3: Subsystem from the 3x3 convolution block

The selector takes as inputs a matrix and the two indices used for selecting a
particular element, and gives that element of the input matrix as the output. The
two selected elements are then multiplied using the multiplication block, and the
product is given as the output of the subsystem.
Once we are able to develop a 3 × 3 convolution block, the last step towards
a full 2-D convolution block is to add loops and iterate over the image. So, two
for-loops are placed in order to traverse the image. After putting in the loop blocks,
we need to extract, for every iteration, the 3 × 3 matrix from the input image and
from the output matrix. To accomplish this task a subsystem called the 3 × 3
extractor was created; it is shown in figure 5.4.
The building block of this 3 × 3 extractor is shown in figure 5.5. It consists of
a selector block and an overwrite-value block. The selector block selects a particular
element from the input matrix and places this element in a working matrix at a given
position. The position from which the element is extracted and the position at which
it is placed are incremented until a complete 3 × 3 matrix is generated. This 3 × 3
matrix is then sent to the convolution block. The result of the convolution between
two 3 × 3 matrices is a single value. These values are collected in a buffer until the
buffer holds exactly as many values as the width of the image; then they are released
as an array and collected in another buffer until their number equals the height of
the image, at which point the whole matrix is given as output. This whole flow can
be seen in figure 5.6.
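A minimal sketch of what the extractor, the loops and the buffers of figures 5.4–5.6 compute together is the following (zero padding stands in for the fixed boundary cells; img and T denote the image and the template):

% Sketch of the loop/extractor/buffer structure of figures 5.4-5.6:
% slide a 3x3 window over the zero-padded image and apply the 3x3
% convolution block at every position.
[h, w] = size(img);
padded = padarray(img, [1 1], 0);      % zero padding: fixed boundary condition
out = zeros(h, w);
for r = 1:h
    for c = 1:w
        P = padded(r:r+2, c:c+2);      % the 3x3 extractor
        out(r, c) = sum(sum(P .* T));  % the 3x3 convolution block
    end
end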

Figure 5.4: 3x3 extractor subsystem



Figure 5.5: Building block of the 3x3 extractor block

Figure 5.6: Figure showing how the information flows after the convolution is per-
formed

5.4 Results
The outputs of the 2-D convolution blocks are fed into the summator together with
the bias value. The sum is given to an integrator; the required settings for the
integration block can be set via the configuration parameters in the simulation menu.
The result of the integration is sent through the PWL-Sigmoid block, which implements
the piecewise linear sigmoid function that limits the state values. This is the output
of the cell. The result from this block can also be displayed in a GUI, but this is not
shown here. The contents of the PWL-Sigmoid block are shown in figure 5.7.

Figure 5.7: Block implementing the piecewise linear sigmoid function

The final model of the cell has the same basic structure as shown in
figure 5.1; it is shown in figure 5.8.

Figure 5.8: Model of a Cell in Simulink


Chapter 6

Conclusions and future directions

6.1 Conclusions
As discussed in the abstract, the main aim of this thesis is to answer the following
research questions:
• How far CNN based processing can be used to realize edge detection?

• How robust would CNN based image processing be under difficult conditions,
especially difficult lighting conditions?

• How can CNN be efficiently implemented on top of MATLAB/Simulink, in


view of rapid prototyping?
As a solution to the first question, a novel edge detection technique, the "trans-
lation residue method", has been proposed in this thesis. Templates have been
designed to implement this algorithm in CNN, and the results have been simulated in
MATLAB using the proposed templates. These results have been compared with
those obtained from some classical image processing techniques. The translation
residue method has been developed from an older way of calculating the edge map,
the dilation residue method. A comparison of the results has shown that, with the
proposed method, no artifacts are generated, as is the case with the dilation residue
method.
In order to show the robustness of CNN, we have taken a scenario where the
subject is illuminated under poor lighting conditions, and effort has been put into
improving the visibility of the subject. A solution for this kind of problem, called
"additive image enhancement based on contrast and intensity", was already proposed
by Brendel et al. [1] and has been studied here. This method has been implemented
in MATLAB with the necessary modifications, and the results of the simulation,
which were presented, are very satisfying. Since this operation is performed after an
image is captured, it is not possible to retrieve data that has not been captured, but
the captured data
can be enhanced. This method can be used to improve the visibility for a driver
driving at night. Law-enforcement agencies can also use this method to enhance a
poorly captured image of a number plate or of a driver. Consider a situation where
an automatic camera is used to enforce speed limits: the real-time capabilities of
CNN allow the analog camera to take as many pictures of the culprit as possible in
a short period (since the vehicle would be moving at a very high speed), and the
ability to enhance the images adaptively makes sure that many of these captured
images contain clear details about the subject.
A model of a basic CNN cell has been developed in MATLAB/Simulink. This
can efficiently be used for rapid prototyping of CNN algorithms on FPGA. The
model has been developed using only basic signal processing blocks, so that it will
be easy, at a later stage, to use DSP Builder¹ to convert this model into HDL code.
This HDL code, when soft-cored onto an FPGA, can build a CNN on top of the
FPGA. This will ultimately enable rapid FPGA prototyping, which will come in
handy and make development easier in the later stages of the project.

6.2 Future directions


Regarding edge detection and contrast enhancement: as mentioned in Appendix A,
nonlinear templates have been inspired by nature. Since nature has the best solutions
for vision problems, we can be confident that nonlinear templates can perform better
than linear templates. So the proposed method, which uses linear templates, could
be improved by developing nonlinear templates; the same applies to contrast
enhancement. Some smoothing techniques could be implemented in parallel to reduce
the impact of noise on the results. Also, a template set could be developed which
creates the effect of "black glasses" only on the regions of high intensity (the area
where the headlight is present in figure 4.11), so that those bright spots can be
efficiently removed.

FPGA implementation: the model of the cell developed in Simulink could be
used to develop a two-dimensional CNN on an FPGA to speed up the information
processing. Further, such a system could be integrated into the "DriveSafe" project
to perform some of the basic pixel operations. Effort can be put into processing
the information from different cameras and integrating the results at a higher level of
abstraction, for example to implement stereo vision algorithms using CNN. This
could significantly improve the performance of advanced driver assistance systems
(ADAS).

¹DSP Builder is a digital signal processing development tool that interfaces between the Quartus II software and The MathWorks' MATLAB/Simulink tools.
Appendix A

A note on nonlinear templates

As discussed in section 2.1.2, the addition of nonlinear templates has made CNN a
more general paradigm. This appendix deals with why nonlinear templates were
introduced, and with their uses and disadvantages. Finally, some comparisons are
made between the performance of linear and nonlinear templates in the particular
case of simulation in MATLAB.

A.1 Purpose of introduction


The primary purpose of the introduction of nonlinear templates is to make CNN a
powerful framework for general analogue array dynamics:
• Even without nonlinear templates, the CNN is still a dynamic nonlinear pro-
cessor.
• The nonlinearity is due to the presence of a nonlinear output equation.
• All the other elements of a cell, i.e. the VCCSs, are linear. This is a
limitation on its scope of application.
• For CNN to become a general framework, nonlinear interactions have been
added.

A.2 Uses
• Nonlinear cloning templates allow the modeling of some biological properties
of the retina:

– Until the concept of CNN, there were no adequate tools to study the
complex transformations in space and time that take place within a simple
vertebrate retina.


– Modeling these complex interactions in space and time only became
possible with the introduction of nonlinear templates and delay templates.

• Because we are able to model the complex transformations that take
place in a retina, we can also study how these transformations
help in performing complex vision tasks, like edge detection, motion detection,
encoding of the direction of motion, etc.

– These studies, in turn, could help in developing templates which can be
used to perform complex vision tasks.

• Adaptability can be achieved by using nonlinear templates

– Since the template values are no longer linear variations of some voltage,
we can change them based on more than one parameter. For example, in
a linear template there is a fixed bias value; if there is a situation
where we need to vary the threshold based on the value of the input
pixel and the current state, then using a nonlinear template is the only
way of achieving this.

One of the major disadvantages of using nonlinear templates is that the system is
more prone to becoming unstable. Note, however, that the use of nonlinear templates
reflects the dynamics of real-world systems, events and phenomena.

A.3 Remarks with respect to MATLAB simulation
When it comes to simulation using MATLAB/Simulink:

• Linear-template simulations are easier to program and run faster than non-
linear-template simulations.

• When running on a CNN-UM, even though a nonlinear template takes more
time, the difference may be insignificant, since there we have real parallel
processing.
Appendix B

The art of CNN template design

This appendix presents some ways of designing templates for coupled and uncou-
pled linear CNN with binary inputs and outputs. The original paper is from
Ákos Zarándy [2] in 1999. According to [2], there are three major template design
methods:
1. the intuitive way,
2. template learning, and
3. direct template derivation.
The first requires intuitive thinking by the designer. In several simple cases it leads
to quick results; however, it is not guaranteed to find the desired template, and
designers need a lot of experience in both image processing and array
dynamics.
The second design method, template learning, is an extensively studied,
popular field of CNN research. Most of the classical neural network training
methods have been implemented for CNN. Learning is based on desired input and
output pairs, which are used to develop better and better templates. But the
problem with CNN is that in several cases a template either works or does not work,
so in these cases this type of learning reduces to a mere brute-force method.
Another problem is that in some cases a template for the given problem does not
exist; the learning methods cannot recognize this and keep on running forever. Yet
another problem arises in cases where no explicit desired output exists, as in
texture segmentation; in these cases the learning method cannot be used at all.
The third method is direct template design. This can be applied when the
desired function is exactly defined, and it needs only a small fraction of the
computational power of the template learning methods.
The original paper [2] explains these three methods with examples, which are
not reproduced here for the sake of brevity.

Appendix C

Pseudocode for CNN implementation

The following pseudocode was used to generate the MATLAB code used to simulate
all the results shown in this thesis. It covers linear templates only; a MATLAB
sketch of this flow is given after the list. The pseudocode is as follows:
• Accept an image, U as input

• Convert U to double values

• Convert the values of matrix U to the range −1 to 1, with −1 representing
white and 1 representing black

• Accept the templates (matrices) A, B and I

• Specify the colormap used for displaying the input and output images,
for example a 64- or 32-level one

• Accept the time of simulation, tf

• Initialize the initial state values

• Convolve the input matrix (image) with template B

• Call a function which implements the PWL sigmoid function, passing the state
values as the argument, and take the output, i.e. the y values

• Convolve the output matrix with the template A

• Sum up both convolution results and the bias, and send this sum to an ODE
solver as an argument, together with the simulation time

• The result from the integration step is shown in the GUI
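A MATLAB sketch of a top-level script following these steps is given below; it reuses the cnn_simulate function sketched in section 5.2, and the file name and template values are placeholders.

% Top-level script following the pseudocode above.
U  = imread('input.png');            % accept an image U (placeholder file)
U  = 1 - 2*im2double(U);             % double values in [-1, 1]:
                                     % -1 = white, +1 = black
A  = [0 0 0; 0 0 0; 0 0 0];          % feedback template (placeholder values)
B  = [-1 0 1; -1 0 1; -1 0 1];       % control template (placeholder values)
I  = 0;                              % bias
tf = 10;                             % simulation time
Y  = cnn_simulate(U, A, B, I, tf);   % convolve, integrate, apply PWL sigmoid
imshow(Y, [])                        % show the result in the GUI
colormap(gray(64));                  % 64-level colormap for display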

Bibliography

[1] M. Brendel and T. Roska, “Adaptive image sensing and enhancement using the
cellular neural network universal machine: Research articles,” Int. J. Circuit
Theory Appl., vol. 30, no. 2-3, pp. 287–312, 2002.

[2] Á. Zarándy, "The art of CNN template design," International Journal of Circuit
Theory and Applications, vol. 27, pp. 5–23, 1999.

[3] M. Csapody and T. Roska, "Adaptive histogram equalization with cellular neural
networks," Proceedings of the IEEE International Workshop on Cellular Neural
Networks and Their Applications, pp. 81–86, 1996.

[4] L. Chua and L. Yang, “Cellular neural networks,” Circuits and Systems, 1988.,
IEEE International Symposium on, pp. 985–988 vol.2, Jun 1988.

[5] L. Chua and L. Yang, “Cellular neural networks: theory,” Circuits and Systems,
IEEE Transactions on, vol. 35, pp. 1257–1272, Oct 1988.

[6] L. Chua and L. Yang, “Cellular neural networks: applications,” Circuits and
Systems, IEEE Transactions on, vol. 35, pp. 1273–1290, Oct 1988.

[7] J. J. Hopfield, “Neural networks and physical systems with emergent collective
computational abilities,” PNAS, vol. 79, pp. 2554–2558, April 1982.

[8] S. Wolfram, “Computation theory of cellular automata,” Communications in


Mathematical Physics, vol. 96, pp. 15–57, 1984.

[9] G. Manganaro, P. Arena, and L. Fortuna, Cellular Neural Networks — Chaos,
Complexity and VLSI Processing. Springer, 1998.

[10] L. Chua and T. Roska, "The CNN paradigm," Circuits and Systems I: Funda-
mental Theory and Applications, IEEE Transactions on, vol. 40, pp. 147–156,
Mar 1993.


[11] T. Roska and L. Chua, "The CNN universal machine: an analogic array com-
puter," Circuits and Systems II: Analog and Digital Signal Processing, IEEE
Transactions on, vol. 40, pp. 163–173, Mar 1993.

[12] T. Roska and L. Chua, “Cellular neural networks with nonlinear and delay-type
template elements,” Cellular Neural Networks and their Applications, 1990.
CNNA-90 Proceedings., 1990 IEEE International Workshop on, pp. 12–25, Dec
1990.

[13] T. Roska and J. Vandewalle, eds., Cellular Neural Networks. New York, NY,
USA: John Wiley & Sons, Inc., 1994.

[14] S. Xavier-de Souza, M. Van Dyck, J. Suykens, and J. Vandewalle, "Fast and
robust face tracking for CNN chips: Application to wheelchair driving," Cellular
Neural Networks and Their Applications, 2006. CNNA '06. 10th International
Workshop on, pp. 1–6, Aug. 2006.

[15] T. Matsumoto, L. Chua, and H. Suzuki, "CNN cloning template: connected
component detector," Circuits and Systems, IEEE Transactions on, vol. 37,
pp. 633–635, May 1990.

[16] J. Canny, “A computational approach to edge detection,” IEEE Trans. Pattern


Anal. Mach. Intell., vol. 8, pp. 679–698, November 1986.

[17] I. Pitas and A. N. Venetsanopoulos, "Edge detectors based on nonlinear fil-
ters," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 8,
pp. 538–550, 1986.

[18] L. O. Chua and T. Roska, Cellular neural networks and visual computing: foun-
dations and applications. New York, NY, USA: Cambridge University Press,
2002.

[19] D. Gábor, "Information theory on electron microscopy," Laboratory Investigation,
vol. 14, pp. 801–807, 1965.

[20] A. Lev, S. W. Zucker, and A. Rosenfeld, "Iterative enhancement of images," IEEE
Transactions on Systems, Man and Cybernetics, vol. SMC-7, pp. 435–447, 1977.

[21] P. Perona and J. Malik, “Scale-space and edge detection using anisotropic dif-
fusion,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 12, no. 7, pp. 629–639,
1990.

[22] C. Rekeczky, T. Roska, and A. Ushida, “Cnn-based difference-controlled adap-


tive nonlinear image filters,” International Journal of Circuit Theory and Ap-
plications, vol. 26, pp. 375–423, 1998.

[23] J. Lee, R. Haralick, and L. Shapiro, "Morphologic edge detection," IEEE Journal
of Robotics and Automation, vol. 3, pp. 142–156, Apr 1987.

[24] W. K. Pratt, Digital Image Processing: PIKS Scientific Inside. Wiley-


Interscience, 2007.

[25] G. X. Ritter and J. N. Wilson, Handbook of Computer Vision Algorithms in


Image Algebra. CRC Press, Inc., 2000.

[26] S. M. Smirnakis, W. Bialek, M. Meister, D. K. Warland, and M. J. Berry,


“Adaptation of retinal processing to image contrast and spatial scale,” Nature,
vol. 386, pp. 69–73, 1997.
