
Blood Clot Anomaly Detection using Generative Adversarial

Networks Approach
Thesis
Submitted in partial fulfillment of the requirements of
BITS F421T

Thesis By
Suyash Sharma

ID No. 2017B4A10627P

Under the supervision of

Dr. Arkaprovo Ghosal

Associate Professor, Department of Chemical Engineering

BIRLA INSTITUTE OF TECHNOLOGY AND


SCIENCE, PILANI

December 27th, 2021

ACKNOWLEDGEMENTS

During the course of this thesis, I received continuous support, encouragement, and guidance from Dr. Arkaprovo Ghosal, whose contribution to this report has been invaluable.

I am extremely grateful to Dr. Banasri Roy (HoD, Department of Chemical Engineering) for her encouragement and motivation for the project, and for ensuring every possible support from her and the department overall.

I am also thankful to all the teaching staff of the Chemical Engineering department for supporting me in all the other ways, thus helping me successfully complete the project work. I would also like to thank all the non-teaching staff for their contributions and support.

CERTIFICATE

This is to certify that the Thesis entitled, Anomaly Detection in blood flow surrounding an artery

clot using GAN based approach submitted by Suyash Sharma, ID No. 2017B4A10627P in partial

fulfillment of the requirement of BITS F421T Thesis embodies the work done by him under my

supervision.

Date: 27.12.2021 Signature of the Supervisor


Dr. Arkaprovo Ghosal
Associate Professor,
Department of Chemical
Engineering

ABSTRACT

Anomaly detection is a fundamental problem in both industrial and medical settings, as it is a critical component of quality control systems that minimize the chance of missing a vital anomaly. In the medical setting considered here, anomaly detection is done by analyzing patients' blood flow images. Because medical images of patients are hard to obtain, this problem is approached in an unsupervised manner.

Generative adversarial networks (GANs) can capture the distribution of their inputs. They can therefore be used to learn the distribution of regular data and then detect anomalies, even if anomalous examples are scarce. However, a significant disadvantage of GANs is instability during training.

This thesis describes a Generative Adversarial Network (GAN) based approach for blood clot detection on colour Doppler ultrasound images of blood vessels. Given the challenges of gathering haemorrhage data and the natural variability of pathology, we study an unsupervised anomaly detection network that learns a manifold of normal blood flow variability and identifies anomalous flow patterns outside the learned manifold.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS.........................................................................................2

ABSTRACT..............................................................................................................4

TABLE OF CONTENTS..........................................................................................5

Chapter 1: Introduction

1. Introduction...............................................................................................7

2. Research Objectives..................................................................................9

3. Research methodology and thesis outline.................................................9

Chapter 2: Background study and literature review

4. Artificial Neural Networks......................................................................11

5. Convolutional Neural Networks.............................................................12

6. Transfer Learning....................................................................................12

7. Autoencoders..........................................................................................13

8. Generative Adversarial Networks..........................................................14

9. Data Augmentation with GANs..............................................................16

10. Representational Learning with GAN....................................................16

11. Anomaly Detection Methods..................................................................17

12. Isolation Trees.........................................................................................17

13. One class Support Vector Machine.........................................................19

Chapter 3: Medical Applications of GAN based Anomaly Detection

1. Anomaly discovery in Endoscopy data...................................................20

2. Predicting Epilepsy.................................................................................20

3. MADGAN...............................................................................................21

4. ABNORMAL CHEST X-RAY IDENTIFICATION WITH GAN ONE-CLASS CLASSIFIER.............................................................22

5. Unsupervised Detection of Lesions in Brain MRI..................................25

Chapter 4: Architecture and Analysis of Anomaly Detection GAN Models


1. Dataset...................................................................................................25

2. AnoGAN...............................................................................................26

3. WGAN...................................................................................................27

4. AnoWGAN............................................................................................28

5. GANomaly............................................................................................30

6. Proposed Architecture and Steps...........................................................31

Chapter 5: Experiments and Results..................................................................34

Chapter 6: Conclusion.........................................................................................40

Bibliography and Referenced Literature..............................................41

Chapter 1: Introduction

1. Introduction

In recent times, there has been a dramatic increase in the use of Deep Learning based applications in various domains, from detecting fraudulent sales to identifying harmful brain tumours. Since anomalous behaviour is often costly (or dangerous) to the system, it is not easy to collect enough data to represent such behaviour. Generative Adversarial Networks (GANs) have drawn significant attention in anomaly detection research because of their unique ability to generate new data.

Many systems rely on and produce large amounts of data in today's society. This data is essential

in many decision-making processes related to these systems. Typically, systems operate under

standard conditions. However, in rare cases, anomalies are possible. Such abnormalities can have

a detrimental effect on the system itself or its environment. Therefore, it is essential to identify anomalies in advance to minimize their impact. For example, cancer is an anomaly in

human tissue. Breast cancer is the second leading cause of death in women [1]. According to a

recent study by the American Cancer Society [2], breast cancer alone accounts for 30% of

women with cancer. Early detection and treatment of breast cancer can significantly increase the

chances of survival [3].

The process of detecting anomalous behavior of a system is called anomaly detection. The main purpose of anomaly detection is to distinguish between expected and unexpected behavior of the system. Given the importance of anomaly detection, it has received widespread attention in research.

GANs are a type of unsupervised generative model that has received a lot of attention in the research community. A well-trained GAN can produce realistic data by sampling from the learned data distribution. A GAN contains a generator and a discriminator model. Both models compete in a two-player zero-sum game, repeatedly improving their data generation and discrimination capabilities.

GANs' ability to generate data makes them attractive for anomaly detection research from two perspectives. First, they can help generate anomalous data points that are otherwise hard to find. Second, they can be used to learn the distribution of data under normal system operation and to act as an anomaly or outlier detector.

Haemorrhages are a severe threat to human life, and their timely and correct diagnosis and treatment are of great importance. Multiple types of haemorrhage are distinguished depending on the location and character of the bleeding. The main division covers five subtypes: subdural, epidural, intra-ventricular, intra-parenchymal, and subarachnoid haemorrhage. This thesis presents an approach to detect these anomalous haemorrhages in Doppler ultrasound images of the head. The model trained for each haemorrhage subtype is a GAN based on the AnoGAN and GANomaly architectures.

2. Research Objectives

The following are the objectives of this thesis:

A big problem in machine learning is the availability of data. Different approaches to generating and growing datasets exist, but none are perfect. This thesis examines whether utilising generative adversarial networks in the data generation process can improve anomaly detection results, alone or combined with previous data generation techniques.

Conventional methods generally encounter many obstacles in modeling blood flow and detecting anomalies within complex geometries, as they lack the quality and quantity of pathology in medical images.

The thesis addresses the following tasks:

• Detect anomalies present in the Doppler ultrasound images of patients.

• Generate images based on the input provided.

• Investigate the applicability and efficiency of the SPH method for modeling thrombus formation during blood flow in vessels with different geometries.

• Assess the influence of various parameters, such as flow rate and vessel geometry, on thrombus formation and separation.

3. Research methodology and thesis outline

The work described in this thesis mainly focuses on the GAN based anomaly detection in blood

flow in arteries or vessels with different flow and material properties. The thesis is structured as

follows:

Chapter 2 briefly provides literature and theoretical knowledge on Deep Learning and

computational methods used and developed by researchers in the past for anomaly detection.

Chapter 3 takes a look at the current state of anomaly detection and generative adversarial networks in the medical domain.

Chapter 4 explains the implementation of multiple GAN based anomaly detection models and identifies the best architecture for our use case.

Chapter 5 presents the results from testing each approach, including both the existing ones and

the proposed approach.

Chapter 6 concludes the findings of this thesis work by explaining the impact of various architectures on the detection of blood clots, and outlines future work.

Chapter 2: Background study and literature review

Artificial Neural Networks

An artificial neural network is an algorithm that can solve numerous different tasks. Each network consists of multiple layers, where each layer consists of multiple neurons. Each neuron takes various inputs, combines them with its internal weights, and produces an output. An artificial neural network is in itself a rather simple algorithm. Still, once combined with backpropagation and some sort of optimization algorithm, often gradient descent, it can be compelling and effectively solve many problems.

Fig 2.1 A simple multi-layer perceptron neural network, where blue circles represent neurons in the input layer, yellow circles represent neurons in a hidden layer, and the green circle represents a neuron in the output layer.

Convolutional Neural Networks

Convolutional neural networks are artificial neural networks that consist of three main layer types:

Convolutional Layers, Pooling Layers, and regular fully-connected layers (The same kind of layer used

in standard fully connected artificial neural networks). This is similar to the previously mentioned fully

connected artificial neural network but with a few key differences. One of the differences is that the

architectures explicitly assume that the inputs are images. This allows the network to encode specific

properties into the architecture, which makes the network more efficient to implement and much faster, given the reduced number of parameters in the network.

Transfer learning

The idea behind transfer learning is that we can train a neural network model on some task and then reuse the learned weights of the model (or parts of it) for a related but different task. Such models learn to detect objects in object recognition tasks, but the classification and detection are done on features extracted through convolutional layers. These features are general, especially in the lower-level convolutional layers. Training these layers requires a lot of data, and the available pre-trained models are trained on big datasets. The learned weights of the network's convolutional layers, which can extract good features from an image, can be reused in many other tasks (especially the lower-level ones); features like edges and curves are rather general among many different image tasks.

The whole process of transfer learning is illustrated in Figure 2.2. Standard practice in transfer learning is to freeze some transferred layers. That means the layer weights are no longer changed during the training process; in other words, we do not allow back-propagation on these layers. This is most effective when the features learned are very similar and relevant across domains. Often, transferred layers are frozen to speed up the training process, reducing the number of computations needed to compute new weights.

Figure 2.2: Transfer learning - The upper part shows a possible architecture of a neural network that is trained on Task A with input A. The lower part of the figure (separated by a dashed line) shows the architecture of the neural network used for our Task B. Note that the convolution blocks are reused, while the other layers are changed.

Usually, convolutional layers would be frozen, to train only the fully-connected layers

that have been attached on top of the pre-trained model. However, in many situations, relevant features are only partly related across different problems and are more similar in the first convolutional layers than in the later ones. Sometimes no layers are frozen, to allow the network to fine-tune the pre-loaded weights and adapt more to the new problem/domain.

Autoencoders

14
Autoencoders are neural networks that learn to encode data efficiently and then reconstruct the original data from such an encoding. The goal is to learn the weights of such an encode-decode pipeline (see Fig. 2.3) so that the reconstruction of the encoded image has minimal loss.

Figure 2.3: The autoencoder consists of the encoder and the decoder neural
networks that are connected to each other. The encoder takes data as an input (for
example an image) and learns to encode it to an efficient data representation.
Then the decoder network learns to reconstruct the image from that data
representation. The image has been taken from [5]

Moreover, this training is done in an unsupervised manner. Typically, autoencoders are used for dimensionality reduction (feature encoding or extraction), but recently they have also been used in generative models [4].

Generative Adversarial Networks (GAN)

A Generative Adversarial Network is a framework for estimating a probability distribution via an adversarial process. As shown in Fig 2.4, it contains two main parts, a generator (G) model and a discriminator (D) model, which are trained simultaneously. Both G and D are neural networks.

Figure 2.4: The structure of a GAN

The purpose of G is to capture the data distribution, while the goal of D is to estimate the probability that a sample came from the training data rather than from G.

A fixed-length noise vector z is used as input to learn the generator's distribution over the data x. The discriminator is trained to maximize the probability of assigning the correct label to generated samples and to samples from the dataset. At the same time, the generator is trained to minimize the likelihood of the discriminator correctly labelling generated samples. In general, the framework is trained by playing a mini-max game with value function V(G, D):

min_G max_D V(G, D) = 𝔼_{x∼p_data(x)}[log D(x)] + 𝔼_{z∼p_z(z)}[log(1 − D(G(z)))]


In practice, the above equation may not provide enough gradient for the generator to learn well. At the early stage of training, when G is poor, D can reject samples from G with high confidence because they are obviously different from samples from the training dataset.
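As an illustration of the adversarial training described above, here is a minimal PyTorch sketch of one GAN training step; the model definitions, the latent dimensionality of 100, and the sigmoid-output discriminator are assumptions of the sketch, not details from this thesis. The generator uses the non-saturating loss −log D(G(z)), the common remedy for the weak early gradients just mentioned.

```python
# A minimal sketch of one GAN training step in PyTorch. G maps noise to
# images and D outputs a probability (sigmoid), both assumed to exist.
import torch
import torch.nn.functional as F

def gan_step(G, D, opt_g, opt_d, real, z_dim=100):
    batch = real.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # Discriminator step: maximize log D(x) + log(1 - D(G(z))).
    z = torch.randn(batch, z_dim)
    fake = G(z).detach()                      # block gradients into G
    d_loss = (F.binary_cross_entropy(D(real), ones)
              + F.binary_cross_entropy(D(fake), zeros))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: the non-saturating loss -log D(G(z)) gives stronger
    # gradients early in training than log(1 - D(G(z))).
    z = torch.randn(batch, z_dim)
    g_loss = F.binary_cross_entropy(D(G(z)), ones)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```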

Data Augmentation with Generative Adversarial Networks

Data augmentation, also known as oversampling, is carried out to compensate for insufficient data in

the dataset to prevent model overfitting. It can also be used to address the problem of data imbalance,

which occurs when the sizes of the classes in a dataset differ considerably. For instance, in a binary

classification task, the class with fewer samples is called the minority class, and the other class is called

the majority class. The corresponding training process would be biased towards the majority class.

Hence, a classifier trained using this dataset would have better accuracy for this class. To address

the imbalanced dataset problem, one can randomly remove samples from the majority class to balance

the class size (undersampling), or augment the minority class by adding artificially generated

instances (oversampling) using suitable techniques.

The problem of the imbalanced dataset is more critical in anomaly detection since it is hard and

expensive to collect data on the anomalous behaviour of the system under study. Often, there are very

few or no examples of anomalous data available. GANs can help by generating more samples for the

anomalous class in this situation.

Representational Learning with GAN

The main goal of GANs is to learn a generative model that produces realistic-looking data by sampling from the learned distribution. This generative power of GANs was highlighted by Goodfellow et al. [6] and Radford et al. [7]. Representation learning with GANs for anomaly detection uses the ability of GANs to learn the distribution of a particular class of data. Several anomaly detection techniques have been proposed that use this representation learning ability of GANs. We will explain the concept of anomaly detection using representation learning through examples of well-known GAN-based anomaly detection techniques (AnoGAN). All other anomaly detection techniques that rely on model learning through a GAN are, to some extent, variations of the AnoGAN technique.

Anomaly detection methods

Anomaly detection is a topic which has been extensively studied. Methods can be categorized into three groups [8]: supervised, unsupervised, and semi-supervised. Supervised methods require annotated datasets to be able to detect anomalies. The need for annotated data is usually unattainable since anomalies are rare. Also, the dataset would be highly imbalanced. Unsupervised anomaly detection methods need only normal instances during training. This is entirely feasible, since most of the data in which one wants to find outliers are not anomalies. Sometimes this discipline is called novelty detection, because it can be described as detecting novelties which do not conform to the distribution of already experienced examples. In the subsequent subsections, we describe work on anomaly detection [8] and some traditional machine learning methods that can be used for unsupervised anomaly detection.

Isolation Forest

Isolation Forest [9] is an unsupervised machine learning algorithm. It is similar to the Decision Tree and its ensemble version, the Random Forest [10]. The idea of the isolation tree is that anomalies require fewer splits in the decision tree than standard instances. An Isolation Forest is an ensemble of such trees. The anomaly score of a sample depends on the number of edges from the root to the node to which the sample has been assigned. It is given by

s(x, n) = 2^(−𝔼[h(x)] / c(n))

where 𝔼[h(x)] is the average number of edges which x has to traverse from the root to a terminal node during classification, c(n) is the average path length of an unsuccessful search in a binary search tree, and n is the number of instances. The advantage of the Isolation Forest is its linear computational complexity in the number of training examples.
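A brief sketch of how the Isolation Forest can be applied in practice with scikit-learn follows; the synthetic two-dimensional data is purely illustrative.

```python
# A brief sketch of unsupervised anomaly scoring with scikit-learn's
# IsolationForest; the synthetic data here is purely illustrative.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(500, 2))      # "healthy" samples
outliers = rng.uniform(-6, 6, size=(10, 2))       # a few anomalies
X = np.vstack([normal, outliers])

forest = IsolationForest(n_estimators=100, random_state=0).fit(X)
labels = forest.predict(X)        # +1 = normal, -1 = anomaly
scores = forest.score_samples(X)  # lower scores = more anomalous
print(f"flagged {np.sum(labels == -1)} anomalies")
```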

One Class Support Vector Machine

The One-Class Support Vector Machine (OC-SVM) is an unsupervised machine learning algorithm based on the Support Vector Machine (SVM) [11], which, in contrast, is supervised and used for binary classification. The principle of SVM lies in finding a hyperplane separating the dataset into two classes. When the partition between individual classes is non-linear, the use of some non-linear function to transform the data points is inevitable. However, when training an OC-SVM, instead of separating data points belonging to two classes, we are trying to find a hyperplane in n-dimensional space such

that the training examples lie on one side and all other points x ∈ ℝⁿ lie on the opposite side. The anomaly score is then determined by the distance of the tested data point from the hyperplane.
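A corresponding sketch for the OC-SVM with scikit-learn is shown below; the RBF kernel and the ν = 0.05 setting (an upper bound on the expected fraction of anomalies) are illustrative choices, not values from this thesis.

```python
# A minimal sketch of One-Class SVM anomaly detection with scikit-learn.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(500, 2))   # train on normal data only
test = np.vstack([rng.normal(0.0, 1.0, size=(20, 2)),
                  rng.uniform(-6, 6, size=(5, 2))])

ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(train)
labels = ocsvm.predict(test)               # +1 = normal, -1 = anomaly
distances = ocsvm.decision_function(test)  # signed distance to hyperplane
print(labels, distances.round(2))
```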
Chapter 3: Medical Applications of GAN based Anomaly Detection

Autoencoders

Since their introduction in [25] as a method for pre-training deep neural networks, AEs have been widely used for automatic feature learning [26]. Fig. 4 illustrates the structure of an AE. They are symmetric, and the model is trained to reconstruct the input from a learned representation captured at the center of the architecture. Formally, let there be N samples in the dataset, let the current instance be x, and let f and g denote the encoder and decoder networks, respectively. Then the compressed representation, z, is given by

z = f(x)

and reconstructed using

y = g(z)

This model is trained to minimise the total reconstruction loss, ∑_{x∈N} L(x, g(f(x))).
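To make the encode-decode pipeline concrete, here is a minimal PyTorch sketch of a convolutional autoencoder trained on the reconstruction loss; the layer sizes, assuming 1 × 64 × 64 inputs, are illustrative and not the architecture used later in this work.

```python
# A minimal sketch of a convolutional autoencoder in PyTorch, trained to
# minimize the reconstruction loss L(x, g(f(x))). Layer sizes are
# illustrative assumptions for 1x64x64 inputs.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(            # f: x -> z
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(            # g: z -> y
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AutoEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(8, 1, 64, 64)                     # a dummy batch of images
loss = nn.functional.mse_loss(model(x), x)       # reconstruction loss
opt.zero_grad(); loss.backward(); opt.step()
```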

GAN based Anomaly Detection

An example of the use of GANs for medical anomaly detection appears in Schlegl et al. [12], in which the authors used the GAN framework to find anomalies in Optical Coherence Tomography (OCT) images. They trained a GAN to generate normal OCT scans from a latent distribution z. A mapping from a given OCT scan back to z is then learned. It should therefore be possible to recover nearly the same image when mapping from the image to z using the encoder and from z back to the image using the generator. If the image contains an anomaly, the authors point out, there is a discrepancy in this round trip, and the anomaly is identified through this process.

Anomaly discovery in Endoscopy data

This section summarizes some of the most popular deep learning architectures proposed for identifying abnormalities in endoscopies. Because endoscopy devices capture RGB data, CNNs pre-trained on large-scale recognition benchmarks such as ImageNet [13] can be used. For example, in [14], the authors used Xception [15], a CNN architecture pre-trained on ImageNet [13], to identify lesions in endoscopy images.

Predicting Epilepsy

The seizure prediction problem can be considered an anomaly detection problem when machine learning models are trained to distinguish between pre-ictal and interictal brain conditions, i.e., to identify when the brain activity of a particular subject changes from a normal interictal state to a pre-ictal (abnormal) state. As the pre-ictal condition is a state of the brain that precedes a seizure, this task is called seizure prediction. We acknowledge that epileptic seizure prediction differs in a few respects from all the other anomaly detection application domains discussed above; however, many studies have framed this task as an anomaly detection problem [16], [17], [18], and that is why we look at it here.

The GAN-based approach is shown in Fig. 3.1. The generator of the GAN model learns to synthesize realistic-looking STFT images from a noise vector. The generated STFTs are passed on to the discriminator, which classifies them as real or fake. Once the generator is trained on this task, the authors adapt the discriminator network by adding two fully connected layers, training it to perform normal/abnormal classification instead of real/fake classification. Therefore, the proposed system augments the training information not only with labeled EEG signals but also with generated samples that are unlabeled.


Figure 3.1: The architecture proposed in [19] for epileptic seizure prediction

MADGAN

Han et al. [32] proposed an unsupervised medical anomaly detection generative adversarial network (MADGAN), a novel two-step method using GAN-based multiple adjacent brain MRI slice reconstruction to detect brain anomalies at different stages on multi-sequence structural MRI.

Figure 3.2: Proposed MADGAN architecture for the next 3-slice generation from the input 3 (256 × 176) brain MRI slices: 3-SA MADGAN has only 3 (red-contoured) SA modules after convolution/deconvolution, whereas 7-SA MADGAN has 7 (red- and blue-contoured) SA modules. Similar to RGB images, we concatenate 3 adjacent grayscale slices into 3 channels.

ABNORMAL CHEST X-RAY IDENTIFICATION WITH GAN ONE-CLASS CLASSIFIER

A chest radiograph (chest X-ray, or CXR) is a radiological examination that is often requested because of its effectiveness in detecting and diagnosing cardiovascular and pulmonary abnormalities. It is also widely used in the screening and evaluation of lung cancer. A timely report by a radiologist for all images is desirable, but it is not always possible due to the heavy workload. As a result, an automated system for abnormal CXR classification may be useful, allowing reporting activities to focus more on the pathological analysis of abnormal CXRs. The proposed adversarial one-class learning framework is inspired by generative adversarial networks (GANs).
Figure 3.3 Framework of the proposed deep adversarial one-class

learning model for abnormal chest X-ray identification.

Three essential modules, namely a U-Net [23] like autoencoder (generator), a convolutional neural network (discriminator), and an encoder network, together constitute the generative adversarial one-class learning architecture (see Figure 3.3). The U-Net like autoencoder (denoted as U) first maps an input CXR image x_i ∈ T with Gaussian noise µ into a lower-dimensional latent space z using a fully convolutional network (first encoder U_E), which is then inversely mapped back using a deconvolutional network (decoder U_D) to generate the reconstructed image x̂. The U-Net like encoder-decoder with skip connections is adopted to preserve high-resolution features through concatenation in the up-sampling (deconvolution) process, and a CNN discriminator (denoted as D) is looped in for adversarial training to produce better and more realistic reconstructions.

A second encoder E is appended after the autoencoder, which further encodes the generated fake image into another latent space z′, in order to force consistency between the two latent vectors z and z′ and the corresponding intermediate feature maps from the two encoders.

Unsupervised Detection of Lesions in Brain MRI

Variational Auto-Encoder (VAE) models [20] and related models, such as the Adversarial AutoEncoder (AAE) [21], have been used successfully to model high-resolution image distributions [22]. They are particularly interesting for the unsupervised detection of abnormal lesions because they allow approximating the likelihood of a given image with respect to the distribution on which they were trained. In their standard auto-encoding structure, a given image is first encoded and then decoded, i.e., reconstructed. In the latent representations, it is assumed that outliers are separated from the normal data samples, and the reconstruction is examined for outlier detection [22] in different computer vision tasks.

The authors studied the detection of brain tumors in an unsupervised manner by learning the distribution of brain MRI data from healthy subjects using autoencoder-based methods. They consider one of the main drawbacks of current models to be the lack of consistency in the latent representation, and they propose a simple but effective constraint that encourages a lesioned image to be mapped close to its healthy counterpart in the latent space.

Chapter 4: Architecture and Analysis of Multiple Anomaly Detection GAN
Models

4.1 Dataset

MRI data from patients with cerebral aneurysms were collected for diagnostic and therapeutic purposes.

Image data was obtained using the AXIOM Artis C-arm digital system with a rotational acquisition time of 5 seconds and 126 frames (190°, or 1.5° per frame, 1024 × 1024-pixel matrix, 126 frames). Post-processing was performed using LEONARDO InSpace 3D. A contrast agent (Imeron 300, Bracco Imaging Deutschland GmbH, Germany) was injected manually into the internal carotid artery (anterior aneurysms) or the vertebral artery (posterior aneurysms). Reconstruction of the volume of interest selected by the surgeon produced a series of ~220 slice images with 256 × 256 voxel matrices in-plane, resulting in an iso-voxel size of ~0.5 mm.

The hallmark of ruptured aneurysms is subarachnoid haemorrhage (SAH), most frequently seen on CT scans or on MRI scans. When CT or MRI scans are normal but symptoms suggest subarachnoid bleeding, a positive finding of blood in the cerebrospinal fluid (lumbar puncture) confirms subarachnoid bleeding. In the case of multiple aneurysms, the localization (asymmetry) of the SAH points to the ruptured aneurysm. In the absence of SAH asymmetry, the most proximal, very large, or abnormally shaped aneurysm is considered the ruptured one. In rare cases, a xanthochromic parenchymal "halo" surrounding the aneurysm, seen at surgery, confirms a history of bleeding (minor or earlier) from this aneurysm.

4.2 AnoGAN

Schlegl et al. [2] introduced a new architecture using generative adversarial networks. The underlying principle, found in various studies, is that the normal conditions (images of the eye retina in that article) lie on a manifold in latent space. As the anomalies lie outside this manifold, it is possible to differentiate them. GANs map points from the latent space to the data space, but not the other way around.

Figure 4.1: Training and testing using AnoGAN

To solve this problem, the authors propose a way to map from the data space to the latent space. When a data point x is given, a random point z sampled from the latent space is iteratively adjusted to minimize the difference between x and G(z). Denote z after γ iterations as z_γ. The method, called feature matching, first proposed by Salimans et al. [27], is used to force the mapped point to lie on the learned manifold. The discriminator loss L_D used to map to the latent space is therefore as follows:

L_D(z_γ) = ∑ | f(x) − f(G(z_γ)) |

where f(·) is the output of an intermediate layer of the discriminator (the last-but-one layer in the computation graph of the neural network). The discriminator loss forces the mapped point for x to lie on the learned manifold. This loss is combined with a residual loss that measures the similarity between the generated image and the actual data:

L_R(z_γ) = ∑ | x − G(z_γ) |

The final loss is calculated as a convex combination of the residual and discriminator losses:

L(z_γ) = (1 − ζ) ⋅ L_R(z_γ) + ζ ⋅ L_D(z_γ)
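The iterative mapping to the latent space can be sketched as follows in PyTorch, assuming a trained generator G and a feature extractor f taken from an intermediate discriminator layer; γ = 500 steps and ζ = 0.1 are illustrative values, not this thesis's tuned settings.

```python
# A minimal sketch of AnoGAN's iterative mapping to latent space, assuming
# a trained generator G and intermediate-layer feature extractor f exist,
# and that x is a batched image tensor of shape (1, C, H, W).
import torch

def map_to_latent(G, f, x, z_dim=100, gamma=500, zeta=0.1, lr=0.01):
    z = torch.randn(1, z_dim, requires_grad=True)  # random starting point z_0
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(gamma):
        gen = G(z)
        loss_r = (x - gen).abs().sum()              # residual loss L_R
        loss_d = (f(x) - f(gen)).abs().sum()        # discriminator loss L_D
        loss = (1 - zeta) * loss_r + zeta * loss_d  # convex combination
        opt.zero_grad(); loss.backward(); opt.step()
    return z.detach(), loss.item()                  # z_gamma and final loss
```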

4.3 WGAN

Wasserstein distance

The Wasserstein distance (Earth Mover's distance) is a distance metric between two probability distributions on a given metric space. Intuitively, it can be seen as the minimal cost required to transform one distribution into another, where cost is the product of the amount of probability mass to be moved and the distance it must travel. Formally, it is defined as:

𝕎(ℙ_r, ℙ_g) = inf_{γ∈Π(ℙ_r, ℙ_g)} 𝔼_{(x,y)∼γ}[‖x − y‖]

Compared to the JS divergence, the Wasserstein distance has the following advantages:

• The Wasserstein distance is continuous and almost everywhere differentiable, which allows us to train the model to optimality.

• The JS divergence saturates as the discriminator gets better; thus, the gradients become zero and vanish.

• The Wasserstein distance is a meaningful metric, i.e., it converges to 0 as the distributions get closer and grows as they move apart.

• The Wasserstein distance as an objective function is more stable than the JS divergence. The mode collapse problem is also mitigated when using the Wasserstein distance as an objective function.

Now that it is clear that minimizing the Wasserstein distance makes more sense than minimizing the JS divergence, it should be noted that the Wasserstein distance as defined in the above equation is intractable to compute directly.

WGAN introduces a critic in place of the discriminator known from GANs. The critic network is similar in construction to a discriminator network, but approximates the Wasserstein distance by learning parameters w* that maximize the equation below. To that end, the critic's objective is as follows:

L_critic(w) = max_{w∈W} 𝔼_{x∼ℙ_r}[f_w(x)] − 𝔼_{z∼Z}[f_w(g_θ(z))]

The difference between the discriminator and the critic is that the discriminator is trained to accurately distinguish samples from ℙ_r from samples from ℙ_g, whereas the critic estimates the Wasserstein distance between ℙ_r and ℙ_g.
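One critic update can be sketched as follows in PyTorch; the weight-clipping constant 0.01 is the default from the original WGAN paper, and the critic, generator, and optimizer objects are assumed to already exist.

```python
# A minimal sketch of one WGAN critic update in PyTorch, using the original
# weight-clipping constraint to enforce the Lipschitz condition.
import torch

def critic_step(critic, G, opt_c, real, z_dim=100, clip=0.01):
    z = torch.randn(real.size(0), z_dim)
    fake = G(z).detach()
    # Maximize E[f_w(x)] - E[f_w(g(z))]  <=>  minimize its negation.
    loss = critic(fake).mean() - critic(real).mean()
    opt_c.zero_grad(); loss.backward(); opt_c.step()
    # Clip critic weights to keep f_w approximately Lipschitz.
    with torch.no_grad():
        for p in critic.parameters():
            p.clamp_(-clip, clip)
    return -loss.item()   # running estimate of the Wasserstein distance
```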

4.4 AnoWGAN

The model uses the improved WGAN to learn the manifold of normal samples. The generator updates its parameters to maximize the critic's output on generated instances:

L_G = min_G −𝔼_{z∼p_z}[D(G(z))]

The critic, on the other hand, updates its parameters to minimize its output on generated instances while maximizing its output on real instances. Consequently, because we are training the WGAN with a gradient penalty, we need to add the penalty term to the critic's loss:

L_C = min 𝔼_{z∼p_z}[D(G(z))] − 𝔼_{x∼p̂_data}[D(x)] + λ 𝔼_{x̂∼p_x̂}[(‖∇_x̂ D(x̂)‖₂ − 1)²]
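The gradient penalty term itself can be sketched as follows in PyTorch; λ = 10 is the value commonly used in the literature rather than necessarily this thesis's setting, and 4-D image batches are assumed.

```python
# A minimal sketch of the WGAN-GP gradient penalty: interpolate between
# real and generated samples, then penalize the deviation of the critic's
# gradient norm from 1. Assumes 4-D (N, C, H, W) image batches.
import torch

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    eps = torch.rand(real.size(0), 1, 1, 1)        # per-sample mixing weight
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(x_hat)
    grads, = torch.autograd.grad(
        outputs=scores, inputs=x_hat,
        grad_outputs=torch.ones_like(scores), create_graph=True)
    grad_norm = grads.flatten(start_dim=1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1) ** 2).mean()
```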

To detect anomalies for a data point x, we need to find its image z in the latent space. We have two versions of our model for finding this latent image. The first is similar to AnoGAN: it utilizes iterative mapping, minimizing the objective with ζ = 0. This procedure is time-consuming.

AnoWGAN makes use of another neural network, an encoder. The difference between our model and the one proposed by Zenati et al. [23] is that we do not train the encoder concurrently with the generator and critic. The encoder is trained by minimizing the absolute difference between an instance x of real data and its projection back into the space of real data through the encoder and the already-trained generator:

L_E = min_E 𝔼_{x∼p̂_data}[ | x − G(E(x)) | ]

Figure 4.2: Pipeline of the GANomaly approach for anomaly detection.

4.5 GANomaly

The generator network includes three components in sequence: the encoder GE, the decoder GD (these two form the autoencoder architecture), and a third component, another encoder E.

Discriminator network: The GAN architecture is completed by the discriminator network. The discriminator and the generator are the building blocks of a standard GAN architecture.

Training data sources for the discriminator network:

• Real data instances - used as positive samples during training; these are real images of MRI scans.

• Fake data instances - produced by the generator and used by the discriminator as negative examples during training.

The main purpose of the discriminator is to distinguish between real and generated data. If a properly trained discriminator cannot tell the difference, then the generator produces realistic images. The generator is continuously updated to deceive the discriminator.

Generator loss: The objective function is built from three losses: the Minimax loss, the modified Minimax loss, and the Wasserstein loss. Each loss tries to improve a different part of the whole architecture.

4.6 Proposed Architecture and Steps

To learn the healthy data distribution, we train a GAN with the same structure as the one proposed in AnoWGAN. During training, the generator learns to produce realistic images from a latent space vector, while the critic improves at separating real from artificial data. An essential feature of this method is that the generator learns to produce not only samples like those seen during training but also new, unseen, yet still 'healthy' ones that fall within the learned data distribution. This feature is vital in this project because we may not have training data covering all possible variations of images that should still be considered healthy.

Unsupervised manifold learning of healthy samples

For example, suppose we are given a training set Xt of "healthy" images containing images x1, x2 ∈ Xt, where x1 has a sealant with width w1 = 10px and x2 has a sealant with width w2 = 15px. We want our model to learn that the "healthy" width of a sealant is 10px ≤ w ≤ 15px, and that a test image with a sealant of width w < 10px or w > 15px is anomalous. This can be learned because the discriminator sees real pictures with different sealant widths in the range 10px ≤ w ≤ 15px. If the generator produces an image with a sealant width outside this range, the discriminator can correctly guess that it was generated, and this information is propagated back. On the other hand, the generator learns to produce images within this range, since then the discriminator can no longer separate them from the real data. This allows us to use smaller training sets, as long as we have pictures describing what is considered "healthy" data. After that, the model can work well with unseen but "healthy" data.

Mapping to the latent space

Once training is complete, the generator G has learned the mapping G(z): z → x from the latent space to new, healthy, and realistic images, and can produce many new and unique images. However, the inverse mapping μ(x): x → z is not readily available. So, to obtain such a z, we first take a random sample z0 from the latent distribution Z and generate G(z0). Then, for l = 0, ..., k steps, we minimize the loss between the currently generated image G(zl) and the test image, making the next generated image G(zl+1) more similar to xtest by updating the coefficients of zl via their gradients, resulting in a point in the latent space. After k steps, we have found an image G(zk) very similar to the test image xtest. We used k = 500 steps, the same as in [31]. The approach in [11] defines a loss function that includes only the reconstruction loss, while Schlegl et al. [31] define the loss for the mapping xtest → z using two parts: the reconstruction (residual) loss and the discriminative loss. The reconstruction loss pushes the images G(z) and xtest to be more similar in appearance, while the discriminative loss compels the generated image G(z) to lie on the learned manifold. Therefore, both the discriminator and the generator are used to adjust the z coefficients by backpropagation.

Detecting anomalies

From the previous steps, we have a trained generator G that produces healthy images, and after some parameter optimization we have obtained a latent vector zk for a test image xtest. Now we need to define a way to classify test images as anomalies. We do this by defining an anomaly score. This score can be defined in different ways, and a few of them have been explored and compared. Schlegl et al. [24] define this score analogously to the loss function they use:

A(x) = (1 − α) ⋅ R(x) + α ⋅ D(x)

where R(x) and D(x) are the reconstruction and the discriminative losses of the last iteration of the mapping-to-latent-space procedure.

The rationale is that A(x) is high for anomalous images and low for normal ones, since an image similar to a normal test image has been seen during training, and the model can find a z in the latent space from which a similar image G(zk) ≈ xtest can be generated. So, we need to define a threshold φ such that

ŷ(x) = 1 if A(x) > φ, and ŷ(x) = 0 otherwise,

where 1 denotes an anomaly and 0 a normal image.
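A small sketch of this scoring-and-thresholding step follows; the names, the α = 0.1 weighting, and the rule of picking φ as the maximum training score plus a 5% margin (the rule used for the autoencoder baseline in Chapter 5) are illustrative.

```python
# A minimal sketch of anomaly classification by thresholding the score
# A(x) = (1 - alpha) * R(x) + alpha * D(x). alpha and phi are illustrative.
import numpy as np

def anomaly_score(r_loss, d_loss, alpha=0.1):
    return (1 - alpha) * r_loss + alpha * d_loss

def classify(scores, phi):
    return (np.asarray(scores) > phi).astype(int)  # 1 = anomaly, 0 = normal

# Example: set phi to the largest score seen on normal training data
# plus a 5% margin, then classify two test scores.
train_scores = np.array([0.8, 1.1, 0.9])
phi = train_scores.max() * 1.05
print(classify([0.95, 2.3], phi))   # -> [0 1]
```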

Anomaly score

After our model can produce G(z), we want to compute the anomaly score A(x), whose rationale is explained in the subsection above. However, there are multiple alternative metrics we can choose from to base our anomaly score on. The AnoGAN work uses a loss function that combines the mean absolute error and the discriminative (feature-based) losses, while WGAN uses only the generator loss (mean absolute error). In this work we use the loss as defined in AnoGAN [31], AnoWGAN, the discriminative loss, and the mean absolute error. After generating a similar image by fitting the latent variable z, we use a fast non-local means denoising algorithm [7] to remove noise from the images. This algorithm removes noise and smooths an image while keeping the structural information, like lines or curves, untouched. We do this to make the images more consistent in their pixel values for metrics like the mean absolute error.
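The denoising step can be sketched with OpenCV's non-local means implementation as follows; the filter strength h = 10 is an illustrative default rather than a tuned value.

```python
# A brief sketch of non-local means denoising with OpenCV before computing
# pixel-wise metrics such as the mean absolute error.
import cv2
import numpy as np

def denoise(img: np.ndarray) -> np.ndarray:
    # img: single-channel uint8 image, e.g. a generated G(z) slice.
    return cv2.fastNlMeansDenoising(img, None, h=10,
                                    templateWindowSize=7,
                                    searchWindowSize=21)

gen = (np.random.rand(64, 64) * 255).astype(np.uint8)  # stand-in for G(z)
clean = denoise(gen)
mae = np.abs(clean.astype(float) - gen.astype(float)).mean()
```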

Chapter 5: Experiments and Results

Transfer Learning

In our experiments, transfer learning with all convolutional (reused) layers frozen did not work very well for detecting blood flow anomalies. When the transferred layers are not frozen and back-propagation is allowed, the model learns fast. Freezing the first convolutional layers makes the model worse than training without freezing (although it converges to similar performance after a number of epochs), but it is much better than models with all layers frozen. A probable reason is that the actual features learned from ImageNet images are very different in nature from the features defining defects in this domain. However, in my understanding, the most basic features learned in the first convolutional layers are still similar (lines, curves, etc.), and because of that, allowing the model to fine-tune itself enables us to achieve a working model.

Autoencoder

Here are the results for the simple autoencoder approach to anomaly detection based on the reconstruction loss (mean squared error). The anomaly score here is defined as explained in section 4.5 and is based on the reconstruction loss (mean squared error) between the test image and the image that goes through the autoencoder. The procedure involves testing the model on the training data, selecting the biggest anomaly score (in this case, mean squared error), adding a 5% margin, and classifying an item as an anomaly if its anomaly score is higher than this threshold. However, because these values (and the classification as well) are based on a threshold selected from observations on the training data, it is important to also look at how these values are distributed for the test set. The loss (mean squared error) distribution is presented in the boxplot (Figure 5.2).

Figure 5.1: Autoencoder’s anomaly detection confusion matrix

Figure 5.2: Autoencoder’s loss distribution
WGAN training

In Figure 5.3, generated images G(z) at different stages of training (number of epochs) are shown. As we can see, the network learns to generate healthy-looking images without mode collapse (as there are positional and brightness differences among the pictures). In Figures 5.4 and 5.3 we plot the training losses of the two modules. An important note is that we train the discriminator 6 times more often than the generator, hence the difference in the number of iterations made. We can see that both losses tend to converge to similar values. In the case of the generator, where the loss is often negative (because it is a WGAN), the loss converges to small values around 0. However, the image quality keeps increasing even after this balance has been reached at 400 iterations. The discriminator loss starts small but quickly jumps to a very high value, and with time it tends to decrease towards 0.

Figure 5.3: WGAN generator loss


Figure 5.4: WGAN discriminator loss

As we can see in Figure 5.5, the GAN approach also performs well; for this dataset it achieves the best result, with an F1 score of 0.95, when using only the discriminative loss as the anomaly score A(x). Using the combined loss (discriminative and residual losses with λ = 0.1), the F1 score is also high: 0.92. In Figure 5.6 we can see two confusion matrices that have been built using different anomaly scores. In a), the discriminative loss has been used as the metric defining the anomaly score A(x), while in b) a combination of the discriminative and reconstruction (mean squared error) based losses is used.

ANO-WGAN

Hyper-parameters of AnoWGAN: the hyper-parameters used for training and evaluation of our model on the MRI dataset are listed in the table below.

Furthermore, training uses the following hyper-parameters.

• After every other epoch, we divide the learning rate of the generator and the critic by 1.65 and

1.25 respectively.

• The encoder learning rate is divided by 2 every four epochs.

• At the start of the training, the critic is pre-trained for 2000 training steps.

• Starting from the second epoch, the critic and the generator are pre-trained for 1000 and 30 training steps, respectively.

• The number of critic iterations is increased by 2 every epoch.

• At the end of the last epoch, during normal data distribution learning, the generator is trained

for 10 more steps.

Hyperparameter                              Value
Batch size                                  4
Batch size - encoder                        32
Initial learning rate G                     3.0 · 10^−5
Initial learning rate D                     10^−4
Initial learning rate E                     10^−5
Initial critic iterations                   7
Epochs                                      5
Epochs - encoder                            6
Penalty coefficient λ                       12
Mapping coefficient ζ                       0.0
Dropout - generator                         0.2
Dropout - critic                            0.3
Latent space dimensionality - encoder       36
Latent space dimensionality - mapping       12
Mapping iterations                          70
Mapping learning rate                       0.09
Adam first momentum β1                      0.1
Adam second momentum β2                     0.9

Figure 5.7: Generated set of images by AnoWGAN

Chapter 6: Conclusion

This thesis focuses on anomaly detection utilizing generative adversarial networks. Our goal was to study GAN-based anomaly detection methods and see whether they can detect anomalous blood clot points in the MRI Aneurysm dataset.

We have developed a model that uses Wasserstein GANs to learn the distribution of normal samples, as well as an encoder (in the form of a deep neural network) to map from the data space to the latent space.

The autoencoder is also a respectable method, due to its robustness, speed, simplicity, and ability to perform anomaly localization, all in an unsupervised manner. Through deep learning and data augmentation, we build robustness to the point where models really learn the important features rather than incidental ones such as rotation, colors, and location.

However, the solutions are not perfect, and there is room for improvement in everything: better feature representation, image generation and anomaly detection, as well as localization performance. It should also be noted that this problem is strongly context-dependent. That means an approach that works for this particular data may not suit a different type of data. It is very important to understand what variations exist in healthy data and what kinds of anomalies one can expect.

Bibliography and Referenced Literature

1. YiSheng Sun, Zhao Zhao, ZhangNv Yang, Fang Xu, HangJing Lu, ZhiYong Zhu, Wen
Shi, Jianmin Jiang, PingPing Yao, and HanPing Zhu. Risk factors and preventions of
breast cancer. International journal of biological sciences, 13(11):1387, 2017.

2. Rebecca L. Siegel, Kimberly D. Miller, and Ahmedin Jemal. Cancer statistics, 2020. CA:
A Cancer Journal for Clinicians, 70(1):7–30, 2020.

3. Ophira Ginsburg, Cheng Har Yip, Ari Brooks, Anna Cabanes, Maira Caleffi, Jorge Antonio Dunstan Yataco, Bishal Gyawali, Valerie McCormack, Myrna McLaughlin de Anderson, Ravi Mehrotra, et al. Breast cancer early detection: A phased approach to implementation. Cancer, 126:2379–2393, 2020.

4. Arden Dertat. Applied deep learning - part 3: Autoencoders.

5. Diederik P. Kingma and Max Welling. Auto-encoding variational bayes. CoRR, abs/1312.6114, 2013.

6. Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley,
Sherjil Ozair, Aaron C. Courville, and Yoshua Bengio. Generative adversarial networks.
CoRR, abs/1406.2661, 2014.

7. Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. CoRR, abs/1511.06434, 2015.

8. V. Chandola, A. Banerjee, and V. Kumar. Anomaly detection: A survey. ACM Computing Surveys, 2009. doi: 10.1145/1541880.1541882.

9. F. T. Liu, K. M. Ting, and Z.-H. Zhou. Isolation forest. IEEE, 2008. doi: 10.1109/ICDM.2008.17.

10. T. M. Mitchell. Machine Learning. McGraw-Hill, 1997. ISBN: 978-0-07-042807-2.

11. C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 1995. doi: 10.1007/BF00994018.
12. P. Ch. Mahalanobis. On the generalized distance in statistics. 1936.

13. Han Zhang, Ian Goodfellow, Dimitris Metaxas, and Augustus Odena. Self-attention generative adversarial networks. In International Conference on Machine Learning, pages 7354–7363, 2019.

14. Mohamad Baydoun, Mahdyar Ravanbakhsh, Damian Campo, Pablo Marin, David
Martin, Lucio Marcenaro, Andrea Cavallaro, and Carlo S Regazzoni. A multi-perspective
approach to anomaly detection for self-aware embodied agents. In 2018 IEEE
International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages
6598–6602. IEEE, 2018.

15. Korosh Vatanparvar and Mohammad Abdullah Al Faruque. Self-secured control with
anomaly detection and recovery in automotive cyber-physical systems. In 2019 Design,
Automation & Test in Europe Conference & Exhibition (DATE), pages 788–793. IEEE,
2019.

16. Mahdyar Ravanbakhsh, Moin Nabi, Enver Sangineto, Lucio Marcenaro, Carlo
Regazzoni, and Nicu Sebe. Abnormal event detection in videos using generative
adversarial nets. In 2017 IEEE International Conference on Image Processing (ICIP),
pages 1577–1581. IEEE, 2017.

17. Samet Akcay, Amir Atapour Abarghouei, and Toby P Breckon. GANomaly: Semi-
supervised anomaly detection via adversarial training. In Asian conference on computer
vision, pages 622–637. Springer, 2018.

18. YuXing Tang, YouBao Tang, Mei Han, Jing Xiao, and Ronald M Summers. Deep
adversarial one-class learning for normal and abnormal chest radiograph classification. In
Medical Imaging 2019: Computer-Aided Diagnosis, volume 10950, page 1095018.
International Society for Optics and Photonics, 2019.

19. Mahmoud Mostapha, Juan Prieto, Veronica Murphy, Jessica Girault, Mark Foster, Ashley
Rumple, Joseph Blocher, Weili Lin, Jed Elison, John Gilmore, et al. Semi-supervised
VAE-GAN for out-of-sample detection applied to MRI quality control. In International
Conference on Medical Image Computing and Computer-Assisted Intervention, pages
127–136. Springer, 2019.

20. Mohammad Sabokrou, Masoud Pourreza, Mohsen Fayyaz, Rahim Entezari, Mahmood
Fathy, Jürgen Gall, and Ehsan Adeli. AVID: Adversarial visual irregularity detection. In
Asian Conference on Computer Vision, pages 488–505. Springer, 2018.

21. Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow, and Brendan Frey. Adversarial
autoencoders. arXiv preprint arXiv:1511.05644, 2015.

22. B Ravi Kiran, Dilip Mathew Thomas, and Ranjith Parakkal. An overview of deep learning based
methods for unsupervised and semi-supervised anomaly detection in videos. arXiv preprint
arXiv:1801.03149, 2018.

23. Yuning Qiu, Teruhisa Misu, and Carlos Busso. Driving anomaly detection with
conditional generative adversarial network using physiological and CAN-Bus data. In
2019 International Conference on Multimodal Interaction, pages 164–173, 2019.

24. Wallace Lawson, Esube Bekele, and Keith Sullivan. Finding anomalies with generative
adversarial networks for a patrolbot. In Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition Workshops, pages 12–13, 2017

25. Flamm M, Diamond S. 2012. Multiscale systems biology and physics of thrombosis under flow. Ann. Biomed. Eng. 40, 2355–2364. doi:10.1007/s10439-012-0557-9.

26. Hemostasis and thrombosis: basic principles and clinical practice, 5th edn. (Brief Article)
(Book Review). 2006. SciTech Book News.

27. Kai Jiang, Weiying Xie, Yunsong Li, Jie Lei, Gang He, and Qian Du. Semi-supervised
spectral learning with generative adversarial network for hyperspectral anomaly detection.
IEEE Transactions on Geoscience and Remote Sensing, 2020.

28. Haruna Watanabe, Ren Togo, Takahiro Ogawa, and Miki Haseyama. Bone metastatic
tumor detection based on AnoGAN using CT images. In 2019 IEEE 1st Global
Conference on Life Sciences and Technologies (LifeTech), pages 235–236. IEEE, 2019.

29. Samet Akcay, Amir Atapour Abarghouei, and Toby P Breckon. GANomaly: Semi-
supervised anomaly detection via adversarial training. In Asian conference on computer
vision, pages 622–637. Springer, 2018.

30. Mahmoud Mostapha, Juan Prieto, Veronica Murphy, Jessica Girault, Mark Foster, Ashley Rumple, Joseph Blocher, Weili Lin, Jed Elison, John Gilmore, et al. Semi-supervised VAE-GAN for out-of-sample detection applied to MRI quality control. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 127–136. Springer, 2019.

31. Gao L, Pan H, Li Q, Xie X, Zhang Z, Han J, Zhai X. Brain medical image diagnosis based on corners with importance values. BMC Bioinformatics. 2017;18(1):1–13. https://doi.org/10.1186/s12859-017-1903-6.

32. Changhee Han et al. MADGAN: unsupervised medical anomaly detection GAN using multiple adjacent brain MRI slice reconstruction. BMC Bioinformatics, 2021.

33. Thomas Schlegl, Philipp Seeböck, Sebastian M Waldstein, Georg Langs, and Ursula Schmidt-Erfurth. f-AnoGAN: Fast unsupervised anomaly detection with generative adversarial networks. Medical Image Analysis, 54:30–44, 2019.

34. Sungmin You, Baek Hwan Cho, Soonhyun Yook, Joo Young Kim, Young Min Shon, Dae
Won Seo, and In Young Kim. Unsupervised automatic seizure detection for focal-onset
seizures recorded with behind-the-ear EEG using an anomaly-detecting generative
adversarial network. Computer Methods and Programs in Biomedicine, page 105472,
2020.

35. Chengfen Zhang, Yue Wang, Xinyu Zhao, Yan Guo, Guotong Xie, Chuanfeng Lv, and
Bin Lv. Memory- augmented anomaly generative adversarial network for retinal OCT
images screening. In 2020 IEEE 17th International Symposium on Biomedical Imaging
(ISBI), pages 1971–1974. IEEE, 2020.

36. Yan Kuang, Tian Lan, Xueqiao Peng, Gati Elvis Selasi, Qiao Liu, and Junyi Zhang.
Unsupervised multi- discriminator generative adversarial network for lung nodule
malignancy classification. IEEE Access, 8:77725– 77734, 2020.

37. YuXing Tang, YouBao Tang, Mei Han, Jing Xiao, and Ronald M Summers. Abnormal
chest X-ray identification with generative adversarial one-class classifier. In 2019 IEEE
16th International Symposium on Biomedical Imaging (ISBI 2019), pages 1358–1361.
IEEE, 2019.

