Networks Approach
Thesis
Submitted in partial fulfillment of the requirements of
BITS F421T
Thesis By
Suyash Sharma
ID No. 2017B4A10627P
ACKNOWLEDGEMENTS
During the course of the thesis, I had continuous support, encouragement, and guidance from
I am extremely grateful to Dr. Banasri Roy (HoD, Chemical department) for her encouragement and
motivation for the project, and for ensuring every possible support from her and the department
overall.
I am also thankful to all the teaching staff of the Chemical department for supporting me in all
other ways, thus helping me successfully complete the project work. I would also like to
thank all the non-teaching staff for their contributions and support.
CERTIFICATE
This is to certify that the Thesis entitled, Anomaly Detection in blood flow surrounding an artery
clot using GAN based approach submitted by Suyash Sharma, ID No. 2017B4A10627P in partial
fulfillment of the requirement of BITS F421T Thesis embodies the work done by him under my
supervision.
ABSTRACT
of quality control systems that minimize the chance of missing a vital anomaly point. Anomaly
detection is often done by analyzing the patients' blood flow images. Because the medical images
Generative adversarial networks (GANs) can capture the distribution of their inputs. They can
therefore be used to learn the distribution of normal data and then detect anomalies, even if they are
This thesis describes a Generative Adversarial Network (GAN) based approach for blood clot
detection on colour Doppler ultrasound images of blood vessels. Given the challenges of
gathering haemorrhage data and the natural pathology variability, we study an unsupervised
anomaly detection network that learns a manifold of normal blood flow variability and identifies
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
ABSTRACT
TABLE OF CONTENTS
Chapter 1: Introduction
1. Introduction
2. Research Objectives
6. Transfer Learning
7. Autoencoders
13. One class Support Vector Machine
2. Predicting Epilepsy
3. MADGAN
2. AnoGAN
3. WGAN
4. AnoWGAN
5. GANomaly
Chapter 6: Conclusion
Chapter 1: Introduction
1. Introduction
In recent times, there has been a dramatic increase in the use of deep learning-based anomaly
detection in various applications, from detecting fraudulent sales to detecting harmful brain tumours.
Since anomalous behaviour is often costly (or dangerous) to the system, it is not easy to collect
enough data to represent such behaviour. Generative Adversarial Networks (GANs) have drawn
Many systems rely on and produce large amounts of data in today's society. This data is essential
in many decision-making processes related to these systems. Typically, systems operate under
standard conditions. However, in rare cases, anomalies are possible. Such abnormalities can have
a detrimental effect on the system itself or its environment. Therefore, it is essential to identify
anomalies early to minimize their impact. For example, cancer is an anomaly in
human tissue. Breast cancer is the second leading cause of death in women [1]. According to a
recent study by the American Cancer Society [2], breast cancer alone accounts for 30% of
cancers in women. Early detection and treatment of breast cancer can significantly increase the
The process of detecting anomalous behavior of a system is called anomaly detection. The main
purpose of anomaly detection is to distinguish between expected and unexpected behavior of the
system. Given the importance of anomaly detection, it has received widespread attention in
research.
GANs are a type of unsupervised generative model that has received a lot of attention in the
research community. A well-trained GAN can produce realistic data by sampling from the
learned data distribution. A GAN contains a generator and a discriminator model. Both models compete in a
two-player zero-sum game, repeatedly improving their data generation and discrimination
capabilities.
The ability of GANs to generate data makes them attractive for anomaly detection research from two
perspectives. First, they can help generate anomalous data points that are hard to collect. Second,
they can be used to learn the distribution of data for the normal operating condition of the system
Haemorrhages are a severe threat to human life, and their timely and correct diagnosis and
treatment are of great importance. Multiple types of haemorrhage are distinguished depending on
the location and character of bleeding. The main division covers five subtypes: subdural,
the head. The model trained for each haemorrhage subtype is a GAN based on the AnoGAN and
GANomaly architectures.
2. Research Objectives
A big problem in machine learning is the availability of data. Different approaches to generating
and growing datasets exist, but none are perfect. This thesis examines whether utilising
generative adversarial networks in the data generation process can improve anomaly detection
Conventional methods generally encounter many obstacles in modeling blood flow and detecting
anomalies within complex geometries as they lack the quality and quantity of pathology in
medical images.
The thesis addresses the following tasks:
• Detect anomalies present in the Doppler ultrasound images of patients.
• Investigate the applicability and efficiency of the SPH method for modeling thrombus
• Assess the influence of various parameters such as flow rate and vessel geometry on thrombus
The work described in this thesis mainly focuses on GAN-based anomaly detection in blood
flow in arteries or vessels with different flow and material properties. The thesis is structured as
follows:
Chapter 2 briefly provides literature and theoretical knowledge on Deep Learning and
computational methods used and developed by researchers in the past for anomaly detection.
Chapter 3 looks at the state of anomaly detection and generative adversarial networks
Chapter 4 explains the implementation of multiple GAN based anomaly detection models and
Chapter 5 presents the results from testing each approach, including both the existing ones and
Chapter 6 concludes the findings of this thesis work by explaining impacts of various
Chapter 2: Background study and literature review
An artificial neural network is an algorithm that can solve numerous different tasks. Each network
consists of multiple layers, where each layer consists of multiple neurons. Each neuron takes several
inputs, combines them with its internal weights, and produces an output. An artificial neural network is in
itself a rather simple algorithm. Still, once combined with backpropagation and some sort of optimization
algorithm, often gradient descent, such networks can be powerful and solve many problems effectively.
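The neuron and layer description above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the sigmoid activation, the layer sizes and all weight values are assumptions chosen for the example and are not taken from the thesis.

```python
import math

def sigmoid(value):
    """Logistic activation, squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))

def neuron_output(inputs, weights, bias):
    """One neuron: weighted sum of the inputs plus a bias, then an activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(weighted_sum)

def layer_output(inputs, weight_rows, biases):
    """A layer is a list of neurons that all see the same inputs."""
    return [neuron_output(inputs, w, b) for w, b in zip(weight_rows, biases)]

def forward(inputs, layers):
    """Feed the inputs through every layer in turn."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer_output(activations, weight_rows, biases)
    return activations

# Hypothetical network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
toy_layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]
prediction = forward([1.0, 2.0], toy_layers)
```

Training would adjust the weights and biases with backpropagation and gradient descent; the sketch covers only the forward computation.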
Fig 2.1 A simple multi-layer perceptron neural network where blue circles represent
neurons in the input layer, yellow circles represent neurons in a hidden layer, and the
green circle represents a neuron in the output layer.
Convolutional Neural Networks
Convolutional neural networks are artificial neural networks that consist of three main layer types:
Convolutional Layers, Pooling Layers, and regular fully-connected layers (The same kind of layer used
in standard fully connected artificial neural networks). This is similar to the previously mentioned fully
connected artificial neural network but with a few key differences. One of the differences is that the
architectures explicitly assume that the inputs are images. This allows the network to encode specific
properties into the architecture, which makes the network more efficient to implement and much faster
Transfer learning
The idea behind transfer learning is that we can train a neural network model on some task and then
reuse the learned weights of the model (or parts of it) for a related but different task. These models learn
to detect objects in object recognition tasks, but the classification and detection are done on the
features extracted by the convolutional layers. These features are general, especially in the lower-level
convolutional layers. Training these layers requires a lot of data, and the available pre-trained
models are trained on big datasets. The learned weights of convolutional layers that can
extract good features from an image can be reused in many other tasks, especially those of the lower-level
layers: features like edges and curves are rather general across many different image tasks.
Figure 2.2: Transfer learning. The upper part shows a possible architecture of a neural
network that is trained on Task A with input A. The lower part of the figure
(separated by a dashed line) shows the architecture of the neural network used
for our task B. Note that the convolution blocks are reused, while the other
layers are changed.
The whole process of transfer learning is illustrated in Figure 2.2. Standard practice in transfer learning
is to freeze some transferred layers. That means that the layer weights are no longer changed during
the training process, i.e. we do not back-propagate through these layers. This is most
effective when the features learned are very similar and relevant across domains. Often transferred
layers are frozen to speed up the training process, reducing the number of computations needed to compute
new weights. Usually, the convolutional layers are frozen so that only the fully-connected layers
attached on top of the pre-trained model are trained. However, in many situations, relevant features
are only partly related across different problems and are more similar in the first convolutional layers
than in the later ones. Sometimes no layers are frozen to allow the network to fine-tune the already pre-
Autoencoders
Autoencoders are neural networks that learn to encode data efficiently and then reconstruct the
original data from the encoded representation. The goal is to learn the weights of such an
encode-decode pipeline (see Fig. 2.3) so that the reconstruction of the encoded image has a minimal loss.
Figure 2.3: The autoencoder consists of the encoder and the decoder neural
networks that are connected to each other. The encoder takes data as an input (for
example an image) and learns to encode it to an efficient data representation.
Then the decoder network learns to reconstruct the image from that data
representation. The image has been taken from [5]
Moreover, this training is done in an unsupervised manner. Typically autoencoders are used for
dimensionality reduction (feature encoding or extraction), but recently have also been used in
GAN
A Generative Adversarial Network (GAN) is a framework for estimating a probability distribution via an
adversarial process. As shown in Fig 2.4, it contains two main parts, a generator (G) model and a
discriminator (D) model, which are trained simultaneously. Both G and D are neural networks. The
purpose of G is to capture the data distribution, while the goal of D is to estimate the probability that a
sample came from the training data rather than from G.
A fixed-length noise variable Z is used as input to learn the generator's distribution over data X.
The discriminator is trained to maximize the probability of assigning the correct label to generated
samples and samples from the dataset. At the same time, the generator is trained to minimize the
likelihood of the discriminator correctly labelling generated samples. In general, the
framework is trained by playing a mini-max game with value function V (G, D):
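For reference, the value function in the standard form given by Goodfellow et al. [6] can be written as follows. The symbols $p_{data}$ (data distribution) and $p_z$ (noise prior) are notation assumed here, so the equation in the original thesis may use different symbols:

$$\min_G \max_D V(G, D) = \mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]$$

The first term rewards D for scoring real samples highly; the second rewards D for scoring generated samples low, which G in turn tries to prevent.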
In the early stage of training, when G is poor, D can reject samples from G with high confidence because
they are obviously different from samples from the training dataset.
Data augmentation, also known as oversampling, is carried out to compensate for insufficient data in
the dataset to prevent model overfitting. It can also be used to address the problem of data imbalance,
which occurs when the sizes of the classes in a dataset differ considerably. For instance, in a binary
classification task, the class with fewer samples is called the minority class, and the other class is called
the majority class. The corresponding training process would be biased towards the majority class.
Hence a classifier trained using this dataset would have better accuracy for this class [148]. To address
the imbalanced dataset problem, one can randomly remove samples from the majority class to balance
the class size (undersampling), or augment the minority class by adding artificially generated samples (oversampling).
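The two balancing strategies can be sketched as follows. This is a minimal illustration under stated assumptions: the sample ids are hypothetical, the helper names are invented for this example, and the oversampling shown here simply duplicates existing minority samples rather than generating new ones with a GAN.

```python
import random

def random_undersample(majority, minority, seed=0):
    """Randomly drop majority-class samples until both classes have equal size."""
    rng = random.Random(seed)
    kept = rng.sample(majority, k=len(minority))
    return kept, list(minority)

def random_oversample(majority, minority, seed=0):
    """Duplicate randomly chosen minority-class samples until the classes match."""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra

normal_samples = list(range(100))        # majority class (hypothetical ids)
anomalous_samples = [1000, 1001, 1002]   # minority class (hypothetical ids)

under_major, under_minor = random_undersample(normal_samples, anomalous_samples)
over_major, over_minor = random_oversample(normal_samples, anomalous_samples)
```

Both helpers assume the majority class is at least as large as the minority class. Undersampling discards information, while duplication adds no new variation, which is the gap that generated samples are meant to fill.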
The problem of the imbalanced dataset is more critical in anomaly detection since it is hard and
expensive to collect data on the anomalous behaviour of the system under study. Often, there are very
few or no examples of anomalous data available. GANs can help by generating more samples for the
The main goal of GANs is to learn a generative model that produces realistic-looking data by sampling
from the learned distribution. This generative power of GANs was highlighted by Goodfellow et al. [6]
and Radford et al. [7]. Representation learning with GANs for anomaly detection uses the ability of
GANs to learn the distribution of a particular class of data. Several anomaly detection techniques
have been proposed that use this representation learning ability of GANs. We will explain the concept using
one such anomaly detection technique, AnoGAN. All other anomaly detection techniques that rely on model
learning through a GAN are, to some extent, variations of the AnoGAN technique.
Anomaly detection is a topic that has been extensively studied. Methods can be categorized into three
groups [8]: supervised, unsupervised and semi-supervised. Supervised methods require annotated datasets to
be able to detect anomalies. Annotated data is usually unattainable since anomalies are
rare. Also, the dataset would be highly imbalanced. Unsupervised anomaly detection methods need
only normal instances during training. This is entirely feasible since most of the data in which one
wants to find outliers are not anomalies. Sometimes this discipline is called novelty detection because
it can be described as detecting novelties, which do not conform to the distribution of already
experienced examples. In the subsequent subsections, we describe papers about anomaly detection [8]
and some traditional machine learning methods that can be used for unsupervised anomaly detection.
Isolation Forest
Isolation Forest [9] is an unsupervised machine learning algorithm. It is similar to Decision Tree and its
ensemble version, Random Forest [10]. The idea of the Isolation tree is that anomalies require fewer
splits in the decision tree than standard instances. Isolation Forest is then an ensemble of such trees.
The anomaly score of a sample depends on the number of edges from the root to a node to which the
$$s(x, n) = 2^{-\frac{\mathbb{E}[h(x)]}{c(n)}}$$
Where E [h(x)] is the average number of edges which x has to traverse from the root to a terminal
node during classification; c(n) is the average path length of unsuccessful search in a binary search
tree. The number of instances is n. The advantage of the Isolation Forest is its linear computational
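The score above can be evaluated with a short sketch. The closed form used for c(n), namely 2H(n - 1) - 2(n - 1)/n with the harmonic number H approximated by ln(i) plus the Euler-Mascheroni constant, follows the original Isolation Forest paper and is stated here as an assumption, since the thesis text does not spell it out. The function names are illustrative.

```python
import math

EULER_MASCHERONI = 0.5772156649

def average_path_length(n):
    """c(n): average path length of an unsuccessful search in a binary search tree."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_MASCHERONI  # approximates H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n

def anomaly_score(mean_path_length, n):
    """s(x, n) = 2 ** (-E[h(x)] / c(n)); requires n >= 2 so that c(n) > 0."""
    return 2.0 ** (-mean_path_length / average_path_length(n))

# With n = 256 instances, a sample isolated after few splits scores closer to 1
# than a sample that needs many splits.
score_short_path = anomaly_score(2.0, 256)
score_long_path = anomaly_score(12.0, 256)
```

A sample whose mean path length equals c(n) scores exactly 0.5; scores well above 0.5 indicate likely anomalies.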
One Class Support Vector Machines (OC-SVM) is an unsupervised machine learning algorithm based
on Support Vector Machines (SVM) [11] which, on the other hand, is supervised. It is used for binary
classification. The principle of SVM lies in finding a hyperplane separating the dataset into two
classes. When the partition between individual classes is non-linear, the use of some non-linear
function to transform data points is inevitable. However, when training OC-SVM, instead of separating
data points belonging to two classes, we are trying to find a hyperplane in n-dimensional space such
that training examples are on one side and all other points $x \in \mathbb{R}^n$ lie on the opposite side. Anomaly
score is then
Chapter 3: Medical Applications of GAN based Anomaly Detection:
Autoencoders
Since their introduction in [25] as a method for pre-training deep neural networks, AEs have been
widely used for automatic feature learning [26]. Fig. 4 illustrates the structure of an AE. They are
symmetric, and the model is trained to reconstruct the input from a learned representation captured
at the center of the architecture. Formally, let there be N samples in the dataset, the current
input be x, and f and g denote the encoder and decoder networks, respectively. Then, the
z = f (x)
y = g(z)
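With these definitions, a commonly used training objective is the mean squared reconstruction error over the N samples. This particular loss is an assumption made for illustration; the text does not state which reconstruction loss was used:

$$\mathcal{L}_{AE} = \frac{1}{N} \sum_{i=1}^{N} \left\lVert x_i - g(f(x_i)) \right\rVert_2^2$$

Here $x_i$ is the i-th sample, $f(x_i)$ its latent code z, and $g(f(x_i))$ the reconstruction y.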
An example of the use of GANs for medical anomaly detection is found in Schlegl et al. [12], in which the
authors used the GAN framework to find anomalies in Optical Coherence Tomography (OCT) scans. They
trained a GAN to generate normal OCT scans from a latent distribution z. An encoder is then
used to map normal OCT scans to z. Therefore, it should be possible to recover the same image when
mapping from the image to z using the encoder and from z back to the image using the generator. If
there is an anomaly, the authors point out that there is a discrepancy in this reconstruction and identify
This section summarizes some of the most popular deep learning architectures presented to identify
abnormalities in endoscopies. Because endoscopy devices capture RGB data, CNNs pre-trained on
large-scale visual recognition benchmarks such as ImageNet [13] can be used. For example, in [14], the authors used
Xception [15], a CNN architecture pre-trained on ImageNet [13], to identify lesions in endoscopy images.
Predicting Epilepsy
The seizure prediction problem can be considered an anomaly detection problem when machine
learning models are trained to distinguish between pre-ictal and interictal brain states, to identify
when the brain activity of a particular subject changes from a normal interictal state to a pre-ictal state
(abnormal state). As the pre-ictal state is the state of the brain preceding a seizure, this task is called
seizure prediction. We acknowledge that epileptic seizure prediction has a few differences
compared to all other anomaly detection application domains we discussed above; however, many
studies have treated this task as an anomaly detection task [16], [17], [18], and that is why
we look at it here.
The GAN-based approach is shown in Fig. 3.1. The GAN generator is able to synthesize STFT
images that look realistic from a noise vector. The generated STFTs are passed on to the
discriminator, which performs real/fake classification. Once the GAN is trained, the authors adapt
the discriminator network to the seizure prediction task by adding two fully connected layers. It is then trained to
perform normal/abnormal classification instead of real/fake classification. Therefore, the proposed system
leverages information not only from labeled EEG signals but also from synthesized samples that are not
MADGAN
MADGAN, an unsupervised medical anomaly detection generative adversarial network, is a novel
two-step method using GAN-based multiple adjacent brain MRI slice reconstruction to detect brain anomalies at
Figure 3.2: Proposed MADGAN architecture for the next 3-slice generation from
the input 3 256 × 176 brain MRI slices: 3-SA MADGAN has only 3 (red-contoured)
SA modules after convolution/deconvolution whereas 7-SA
MADGAN has 7 (red- and blue-contoured) SA modules. Similar to RGB
images, we concatenate adjacent 3 gray slices into 3 channels.
Chest radiograph (chest X-ray, or CXR) is a radiological examination that is often requested
abnormalities. It is also widely used in the prevention and evaluation of lung cancer. A timely
report by a radiologist of all images is desirable, but it is not always possible due to the heavy
workload. As a result, an automated system for abnormal CXR classification may be useful, allowing
reporting activities to focus more on the pathological analysis of abnormal CXRs. The proposed
(GANs).
Figure 3.3 Framework of the proposed deep adversarial one-class
Three essential modules, namely, a U-Net [23] like autoencoder (generator), a convolutional
neural network (discriminator) and an encoder network, together constitute the generative
adversarial one-class learning architecture (see Figure 3.3). The U-Net like autoencoder (denoted as
U) first maps an input CXR image x_i ∈ T with Gaussian noise µ into a lower-dimensional latent
space z using a fully convolutional network (1st encoder U_E), which is then inversely mapped
back using a deconvolutional network (decoder U_D) to generate the reconstructed image.
The U-Net like encoder-decoder with skip connections is adopted to preserve high-resolution
discriminator (denoted as D) is looped for adversarial training to produce better and more
realistic reconstruction.
A second encoder E is appended after the autoencoder, which further encodes the generated fake
image into another latent space z′, in order to force the consistency between two latent vectors z
and z′ and corresponding intermediate feature maps from the two encoders.
Variational Auto-Encoder (VAE) models [20] and related models, such as the Adversarial AutoEncoder
(AAE) [21], have been used successfully to approximate high-dimensional image distributions and to
generate images [22]. They are particularly interesting for the unsupervised detection of abnormal lesions
because they allow approximating the likelihood of a given image with respect to the distribution on which they are
trained. In their encoder-decoder structure, the given image is first encoded and then decoded, i.e.
reconstructed. In the latent space, it is assumed that outliers are separated from the
normal data samples, and the reconstruction is investigated for outlier detection [22] in different
The authors studied the detection of brain tumors in an unsupervised manner by learning the distribution of
brain MRI data of healthy subjects using autoencoder-based methods. They consider one of the
main drawbacks of current models to be the lack of consistency in the latent representation and proposed
a simple but effective constraint that helps map an image bearing a lesion close to its healthy
Chapter 4: Architecture and Analysis of Multiple Anomaly Detection GAN
Models
4.1 Dataset
MRI Data from patients with cerebral aneurysms were collected for diagnostic and therapeutic
purposes.
Image data was obtained using the AXIOM Artis C-arm digital output system with a rotational
acquisition time of 5 seconds and 126 frames (190° rotation, i.e. 1.5° per frame, 1024 x 1024-pixel matrix).
Post-processing was performed using LEONARDO InSpace 3D. A contrast
agent (Imeron 300, Bracco Imaging Deutschland GmbH, Germany) was injected manually into
the internal carotid artery (anterior aneurysms) or the vertebral artery (posterior aneurysms).
Reconstruction of the volume of interest selected by the surgeon produced a series of ~220
image slices with 256x256 voxel matrices in-plane, resulting in an isotropic voxel size of
~0.5 mm.
(frequency) or MRI scans. In standard CT or MRI scans but with suspicious symptoms of
points to a ruptured aneurysm. In the absence of SAH asymmetry, where proximal or very large
or, abnormally shaped aneurysm is considered degenerative. In rare cases, the xanthochromic
parenchymal "halo" surrounding the aneurysm seen in surgery confirms the history of bleeding
4.2 AnoGAN
Schlegl et al. [2] introduced a new architecture using generative adversarial networks. The principle is
to learn a manifold on which the normal instances (images of the eye retina in that article)
lie. As the anomalies do not lie on this manifold, it is possible to distinguish them. GANs map points from
the latent space to the data space but not the other way around.
To solve this problem, the authors propose a way to map from the data space to the latent space.
When a data point x is given, a random point z from the latent space is iteratively
adjusted to minimize the difference between x and G(z). Denote z after γ iterations as zγ. A method
called feature matching, first proposed by Salimans et al. [27], is used to force the mapped point
to lie on the learned manifold. The discrimination loss LD used to map to the latent space
is therefore as follows:
$$L_D(z_\gamma) = \sum \left| f(x) - f(G(z_\gamma)) \right|$$
where f(·) is the output of an intermediate layer of the discriminator. The intermediate layer is the last layer
but one in the computational graph of the neural network. The discrimination loss forces the point
to which x is mapped to lie on the learned manifold. This loss is combined with a residual loss that
measures the dissimilarity between the generated instance and the actual data:
$$L_R(z_\gamma) = \sum \left| x - G(z_\gamma) \right|$$
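Both losses are sums of absolute differences and can be sketched directly. In this illustration, images and feature maps are assumed to be flattened into plain lists of numbers, and the numeric values are invented for the example; in a real model, G(zγ) and f(·) would come from the trained generator and discriminator.

```python
def residual_loss(x, generated):
    """L_R: sum of absolute differences between x and G(z_gamma)."""
    return sum(abs(a - b) for a, b in zip(x, generated))

def discrimination_loss(features_x, features_generated):
    """L_D: sum of absolute differences between f(x) and f(G(z_gamma))."""
    return sum(abs(a - b) for a, b in zip(features_x, features_generated))

# Hypothetical flattened image, its current reconstruction, and the
# intermediate discriminator features of both.
x = [1.0, 2.0, 3.0]
g_z = [1.0, 2.0, 5.0]
f_x = [0.5, 0.5]
f_g_z = [0.5, 1.5]

loss_r = residual_loss(x, g_z)
loss_d = discrimination_loss(f_x, f_g_z)
```

The residual loss compares pixels, while the discrimination loss compares what the discriminator "sees" in both images, which is why the latter keeps the mapped point close to the learned manifold.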
4.3 WGAN
Wasserstein distance
The Wasserstein distance (Earth Mover's distance) is a distance metric between two probability
distributions on a given metric space. Intuitively, it can be seen as the minimum work required to
transform one distribution into another. The work is defined as the product of the mass of the distribution
• The JS divergence saturates as the discriminator gets better; thus, the gradients become zero
and vanish.
• The Wasserstein distance is a meaningful metric, i.e., it converges to 0 as the distributions approach
• The Wasserstein distance as an objective function is more stable than the JS divergence. The
mode collapse problem is also mitigated when using the Wasserstein distance as an objective
function.
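The contrast drawn in the bullets above can be illustrated numerically. The sketch below is an assumption-laden toy: it uses one-dimensional point masses, computes the Wasserstein-1 distance for equal-size samples by sorting, and computes the JS divergence for discrete distributions on a shared grid. None of the helper names come from the thesis.

```python
import math

def wasserstein_1d(samples_a, samples_b):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples."""
    if len(samples_a) != len(samples_b):
        raise ValueError("this sketch assumes equally many samples")
    pairs = zip(sorted(samples_a), sorted(samples_b))
    return sum(abs(a - b) for a, b in pairs) / len(samples_a)

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (natural log)."""
    def kl(u, v):
        return sum(ui * math.log(ui / vi) for ui, vi in zip(u, v) if ui > 0)
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)

# A point mass at bin 0 compared with a point mass shifted to bin 1 or bin 4.
p = [1.0, 0.0, 0.0, 0.0, 0.0]
q_near = [0.0, 1.0, 0.0, 0.0, 0.0]
q_far = [0.0, 0.0, 0.0, 0.0, 1.0]

js_near, js_far = js_divergence(p, q_near), js_divergence(p, q_far)
w_near, w_far = wasserstein_1d([0.0], [1.0]), wasserstein_1d([0.0], [4.0])
```

For these non-overlapping distributions the JS divergence stays at log 2 regardless of how far apart they are, so it gives no signal about which direction reduces the gap, while the Wasserstein distance grows with the shift.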
Now that it has become clear that optimizing the Wasserstein distance makes more sense than optimizing
the JS divergence, it should be noted that the Wasserstein distance described in the above equation
is intractable.
WGAN introduces the critic instead of the discriminator we know from GAN. The critic network is similar in
design to the discriminator network but predicts the Wasserstein distance by finding
w*, which maximizes the equation below. To that end, the critic's objective is as follows:
The difference between the discriminator and the critic is that the discriminator is trained to correctly
distinguish samples from P_r from samples from P_g. The critic estimates the Wasserstein distance
4.4 AnoWGAN
The model uses Improved WGAN to learn the manifold of normal samples. The generator changes its
The critic, on the other hand, changes its parameters to minimize its output on generated instances and at
the same time maximize its output on real instances. Consequently, because we are training WGAN
with gradient penalty, we need to add the correction to the critic's loss:
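$$L_C = \min_D \; \mathbb{E}_{z \sim p_z}\left[D(G(z))\right] - \mathbb{E}_{x \sim p_{data}}\left[D(x)\right] + \lambda \, \mathbb{E}_{\hat{x} \sim p_{\hat{x}}}\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\right)^2\right]$$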
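The three terms of this loss can be traced in a small sketch. To keep it free of any deep learning library, the critic here is assumed to be linear, D(x) = w · x, so that its gradient with respect to the input is simply w. This is a deliberate simplification for illustration, not the critic used in the thesis, and the default λ = 10 is an assumption taken from the WGAN-GP literature.

```python
import math
import random

def linear_critic(weights, x):
    """Toy critic D(x) = w . x; its gradient with respect to x is simply w."""
    return sum(w * xi for w, xi in zip(weights, x))

def critic_loss(weights, real_batch, fake_batch, lam=10.0, seed=0):
    """E[D(G(z))] - E[D(x)] + lam * E[(||grad D(x_hat)||_2 - 1)^2]."""
    rng = random.Random(seed)
    fake_term = sum(linear_critic(weights, f) for f in fake_batch) / len(fake_batch)
    real_term = sum(linear_critic(weights, r) for r in real_batch) / len(real_batch)

    penalties = []
    for real, fake in zip(real_batch, fake_batch):
        eps = rng.random()
        # x_hat is a random point on the line between a real and a fake sample.
        # A real implementation would evaluate the critic's gradient at x_hat
        # with automatic differentiation.
        x_hat = [eps * r + (1.0 - eps) * f for r, f in zip(real, fake)]
        # For the linear critic the gradient is w everywhere, including at x_hat.
        grad_norm = math.sqrt(sum(w * w for w in weights))
        penalties.append((grad_norm - 1.0) ** 2)
    penalty = sum(penalties) / len(penalties)
    return fake_term - real_term + lam * penalty

real = [[1.0, 0.0]]
fake = [[0.0, 0.0]]
loss_unit_gradient = critic_loss([0.6, 0.8], real, fake)   # gradient norm 1
loss_large_gradient = critic_loss([3.0, 4.0], real, fake)  # gradient norm 5
```

When the gradient norm is 1 the penalty vanishes and only the difference of critic outputs remains; a gradient norm of 5 adds a penalty of λ · 16, which dominates the loss.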
To detect whether a data point x is anomalous, we need to find its image z in latent space. We have two
versions of our model to find the image in latent space. The first is similar to AnoGAN: it utilizes
AnoWGAN makes use of another neural network, an encoder. The difference between our model and the
one proposed by Zenati et al. [23] is that we do not train the encoder concurrently with the generator
and critic. The encoder is trained by minimizing the absolute difference between an instance x of actual
data and its projection back to the space of real data through the encoder and the already trained
generator:
4.5 GANomaly
The generator network consists of three components in sequence: the encoder GE, the decoder GD (these two
together form the autoencoder architecture), and a second encoder E.
part and generator are the building blocks of a standard GAN architecture.
• Real data instances - used as positive samples during training; these are real images of MRI
scans
• Fake data instances - produced by the generator and used by the discriminator as negative examples during
training.
The main purpose of the discriminator, when properly trained, is to distinguish between actual and generated
data. If it cannot tell the difference, then the generator produces realistic images. The generator is
Loss of generator: The objective function is built using three losses: Minimax Loss, Modified
Minimax Loss, and Wasserstein Loss. Every loss tries to improve a different part of the whole architecture.
To learn the healthy data distribution, we train a GAN with the same structure as the one proposed in
AnoWGAN. During training, the generator learns to make realistic images from the latent space
vector while the discriminator improves at separating actual from artificial data. An essential feature of this
method is that our generator learns to produce not only samples as seen during training but also new, unseen, but
still 'healthy' ones that fall into the learned data distribution. This feature is vital in this project
because we may not have training data covering all possible variations of images that will still be
considered healthy.
For example, suppose we are given a training set Xt of "healthy" images that contains images x1, x2 ∈
Xt, where x1 has a sealant with a width w1 = 10px, and x2 has a sealant with a width w2 = 15px. We
want our model to learn that the "healthy" width of a sealant is 10px ≤ w ≤ 15px and that if w < 10px or
w > 15px, a test image with such a sealant is anomalous. This can be learned because the discriminator
sees real images with different sealant widths in the range 10px ≤ w ≤ 15px. If the generator produces an
image with a sealant width outside of this range, the discriminator can rightly guess that it was generated and
feed this information back. On the other hand, the generator learns to make images in this range, as then
the discriminator cannot separate the data. This allows us to have smaller training sets as long as we
have images describing what is considered "healthy" data. After that the model can work well with
Once the training is complete, generator G has learned the mapping G(z) = z → x from the latent space to new
healthy and realistic images, and can produce many new and unique images. However, the inverse mapping
x → z is not readily available. So to obtain such a z, we first draw a random sample z0 from the
distribution of the latent space Z and then produce G(z0). Then, in l = 0, ..., k steps, we backpropagate the loss
between the generated image G(zl) and the test image xtest, making the next generated image G(zl+1)
more similar to xtest by updating the coefficients of zl via gradients, resulting in a final position in the
latent space. After k steps, we have found an image G(zk) very similar to the test image xtest. We used k =
500 steps, the same as in [31]. The work in [11] defines a loss function that includes only the
reconstruction loss, while Schlegl et al. [31] describe the loss function for mapping xtest to z using
two parts: the reconstruction (residual) loss 5.2 and the discrimination loss 5.3. The reconstruction
loss pushes the G(z) and xtest images to be more similar in appearance, while the discrimination loss
compels the generated image G(z) to lie on the learned manifold. Therefore, both the bias and the
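The iterative search for z can be sketched with a toy generator. Everything in this example is an assumption made for illustration: the generator is a fixed linear map G(z) = Wz standing in for a trained network, a squared-error reconstruction loss is swapped in for the absolute-difference loss so that the gradient is smooth and can be written analytically, the discrimination loss is omitted, and the learning rate is arbitrary. Only the k = 500 steps follow the text.

```python
def generate(weights, z):
    """Toy linear generator G(z) = W z (stand-in for a trained GAN generator)."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in weights]

def squared_error(x, generated):
    """Reconstruction loss used in this sketch: sum of squared differences."""
    return sum((g - xi) ** 2 for g, xi in zip(generated, x))

def search_latent(weights, x_test, z0, steps=500, learning_rate=0.05):
    """Gradient descent on z so that G(z) approaches x_test."""
    z = list(z0)
    for _ in range(steps):
        residual = [g - xi for g, xi in zip(generate(weights, z), x_test)]
        # Gradient of sum((Wz - x)^2) with respect to z is 2 * W^T (Wz - x).
        gradient = [
            2.0 * sum(weights[i][j] * residual[i] for i in range(len(weights)))
            for j in range(len(z))
        ]
        z = [zj - learning_rate * gj for zj, gj in zip(z, gradient)]
    return z

W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # hypothetical 2-D latent -> 3-pixel "image"
x_normal = generate(W, [0.3, -0.2])         # lies on the generator's manifold
x_anomalous = [0.3, -0.2, 2.0]              # third pixel inconsistent with the manifold

z_normal = search_latent(W, x_normal, z0=[0.0, 0.0])
z_anomalous = search_latent(W, x_anomalous, z0=[0.0, 0.0])
error_normal = squared_error(x_normal, generate(W, z_normal))
error_anomalous = squared_error(x_anomalous, generate(W, z_anomalous))
```

The image that lies on the generator's manifold is reconstructed almost perfectly, while the inconsistent image keeps a clearly non-zero residual however long the search runs. That remaining residual is what an anomaly score is built on.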
Detecting anomalies
From the previous steps, we have trained the generator G to produce healthy images and, after the
latent-space optimization, obtained a latent vector zk for a test image xtest. Now we need to
define a way to classify test images as anomalies. We do this by defining an anomaly score. This
score can be defined in different ways, a few of which have been explored and compared. Schlegl
et al. [24] define this score analogously to the loss function they use (equation 5.4):
Where R(x) and D(x) are the reconstruction and the discriminative losses of the last iteration of the
The rationale is that A(x) is high for anomalous images and low for normal ones: for a normal test
image, a similar image xt has been seen during training, so the model can find a z in the latent space
from which an image xs similar to both xt and xtest can be generated. We therefore need to define a threshold such that:
Where 1 is an anomaly and 0 is not.
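The anomaly score and the decision rule discussed in this subsection can be written out as follows. This is a sketch that follows the AnoGAN formulation the text refers to, not a reproduction of the thesis' own numbered equations. The symbol f(·), denoting an intermediate feature layer of the discriminator, and the threshold symbol τ are assumptions introduced here.

```latex
\begin{align}
R(x) &= \sum \lvert x - G(z_k) \rvert \\
D(x) &= \sum \lvert f(x) - f(G(z_k)) \rvert \\
A(x) &= (1 - \lambda)\, R(x) + \lambda\, D(x) \\
\hat{y}(x) &=
  \begin{cases}
    1, & A(x) > \tau \\
    0, & \text{otherwise}
  \end{cases}
\end{align}
```

Here z_k is the latent vector obtained after the last mapping iteration, and λ weights the discrimination term against the residual term.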
Anomaly score
Once our model can produce G(z), we want to compute the anomaly score A(x), whose rationale is
explained in the subsection above. However, there are several alternative metrics on which the
anomaly score can be based. AnoGAN uses a loss function that combines the mean absolute error
with the discriminative (feature-based) loss, while WGAN uses only the generator loss (mean
absolute error). In this work we use the loss as defined in AnoGAN [31], the AnoWGAN loss, the
discriminative loss, and the mean absolute error. After generating a similar image by fitting the latent
variable z, we use a fast non-local means denoising algorithm [7] to remove noise from the images. This
algorithm removes noise and smooths an image while leaving structural information, such as lines
or curves, untouched. We do this to make them more consistent in the pixel values for the
Chapter 5: Experiments and Results
Transfer Learning
In our experiments, transfer learning with all reused convolutional layers frozen did not work
well for determining blood flow anomalies. When the transferred layers are not frozen and
back-propagation is allowed through them, the model learns fast. Freezing only the first
convolutional layers makes the model worse than without freezing (although it converges to similar
performance after a number of epochs), but much better than models with all layers frozen. A
probable reason is that the features learned from ImageNet images are very different in nature from
the features that define defects in this domain. However, in my understanding, the most basic features
learned in the first convolutional layers are still similar (lines, curves, etc.), and because of that,
allowing the model to fine-tune itself enables us to achieve a working model.
Autoencoder
Here are the results for the simple autoencoder approach to anomaly detection. The anomaly score is
defined as explained in section 4.5 and is based on the reconstruction loss (mean squared error)
between the test image and the image that has passed through the autoencoder. Thresholding involves
testing the model on the training data, selecting the largest anomaly score (in this case, the mean
squared error), adding a 5% margin, and classifying an item as an anomaly if its anomaly score is
higher than this threshold. However, because these values (and the classification as well) depend on
a threshold selected from observations of the training data, it is important to also look at how these
values are distributed (for the test set). The loss (Mean squared error) distribution is
Figure 5.2: Autoencoder’s loss distribution
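The thresholding rule described in this section (largest training score plus a 5% margin) can be sketched in a few lines. The function names and the flat-list image representation are assumptions made for illustration; the autoencoder itself is not shown, only the scoring and decision step.

```python
def mean_squared_error(image, reconstruction):
    """Reconstruction loss between an image and its autoencoder output."""
    return sum((a - b) ** 2 for a, b in zip(image, reconstruction)) / len(image)


def fit_threshold(training_scores, margin=0.05):
    """Largest anomaly score seen on the training data plus a relative margin."""
    return max(training_scores) * (1.0 + margin)


def classify(score, threshold):
    """Return 1 (anomaly) if the score exceeds the threshold, otherwise 0."""
    return 1 if score > threshold else 0
```

Because the threshold is derived only from the training data, a single unusually noisy training image raises it for every test image, which is one reason to inspect the score distribution on the test set as well.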
WGAN training
Figure 5.3 shows generated images G(z) at different points in training (number of epochs). As we can
see, the network learns to generate healthy-looking images without mode collapse (as there are
positional and brightness differences among the pictures). In figures 5.4 and 5.3 we plot the training
loss of the two modules, the generator and the discriminator. An important note is that we train the
discriminator 6 times more than the generator, hence the difference in the number of iterations made.
We can see that both losses tend to converge to similar values. In the case of the generator, where the
loss is often negative (because it is a WGAN), the loss converges to small values around 0. However,
the image quality continues to improve even after this balance has been reached at 400 iterations.
The discriminator loss starts small but quickly jumps to
As we can see in Figure 5.5, the GAN approach also performs well: for this dataset it achieves its
best result, an F1 score of 0.95, when using only the discriminative loss as the anomaly score A(x).
Using the combined score (discriminative and residual losses with λ = 0.1), the F1 score is also high,
at 0.92. Figure 5.6 shows two confusion matrices built using different anomaly scores. In a)
the discrimination loss has been used as the metric defining the anomaly score A(x), while in b)
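For reference, the relation between a confusion matrix and the F1 score reported above can be sketched as follows. The labels used in any call to these functions are made-up illustrations, not the thesis' results; 1 marks an anomaly and 0 a healthy image, as in the thresholding rule.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, with 1 = anomaly, 0 = healthy."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

The F1 score ignores true negatives, so it stays informative when healthy images greatly outnumber anomalous ones.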
ANO-WGAN
Hyper-parameters of AnoWGAN
Hyper-parameters used for training and evaluation of our model on
• After every other epoch, we divide the learning rate of the generator and the critic by 1.65 and
1.25 respectively.
• At the start of the training, the critic is pre-trained on 2000 training steps.
• Starting from the second epoch, the critic and the generator are pre-trained on 1000 and 30
• At the end of the last epoch, during normal data distribution learning, the generator is trained
Hyperparameter                           Value
Batch size                               4
Batch size – encoder                     32
Initial learning rate G                  3.0 · 10^−5
Initial learning rate D                  10^−4
Initial learning rate E                  10^−5
Initial critic iterations                7
Epochs                                   5
Epochs – encoder                         6
Penalty coefficient λ                    12
Mapping coefficient ζ                    0.0
Dropout – generator                      0.2
Dropout – critic                         0.3
Latent space dimensionality – encoder    36
Latent space dimensionality – mapping    12
Mapping iterations                       70
Mapping learning rate                    0.09
Adam first momentum β1                   0.1
Adam second momentum β2                  0.9
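The learning-rate decay listed among the hyper-parameters (generator rate divided by 1.65 and critic rate divided by 1.25 after every other epoch) can be sketched as a small helper. The function name is hypothetical, and reading "after every other epoch" as "after epochs 2, 4, ..." is an assumption; the source does not spell out the exact schedule.

```python
def decayed_rate(initial_rate, divisor, epoch, every=2):
    """Learning rate in effect during `epoch` (1-based).

    Assumes the rate is divided by `divisor` once after every `every`-th
    completed epoch, i.e. after epochs 2, 4, ... for every=2.
    """
    completed_decays = (epoch - 1) // every
    return initial_rate / divisor ** completed_decays


# Rates per epoch for the values in the table (5 epochs).
generator_rates = [decayed_rate(3.0e-5, 1.65, e) for e in range(1, 6)]
critic_rates = [decayed_rate(1.0e-4, 1.25, e) for e in range(1, 6)]
```

Under this reading, the critic rate is 10^−4 for epochs 1 and 2, 8 · 10^−5 for epochs 3 and 4, and 6.4 · 10^−5 for epoch 5.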
Chapter 6: Conclusion
This thesis focuses on anomaly detection using generative adversarial networks. Our goal
was to study GAN-based anomaly detection methods and see if they can detect anomalous blood
We have developed a model that uses Wasserstein GANs to learn the distribution of
normal samples, as well as an encoder (in the form of a deep neural network) to map
The autoencoder is also a respectable method, owing to its robustness, speed, simplicity and ability
to perform anomaly localization, all in an unsupervised manner. Through deep learning and
data augmentation we achieve robustness to the point where models really learn important features
However, the solutions are not perfect and there is room for improvement in every area: better
feature representation, image generation and anomaly detection, as well as anomaly localization
performance. It should also be noted that this problem is strongly context-dependent: an approach
that works for this particular data may not suit a different type of data. It is very
important to understand what variations exist in healthy data and what kind of anomalies one can
expect.
Bibliography and Referenced Literatures
1. YiSheng Sun, Zhao Zhao, ZhangNv Yang, Fang Xu, HangJing Lu, ZhiYong Zhu, Wen
Shi, Jianmin Jiang, PingPing Yao, and HanPing Zhu. Risk factors and preventions of
breast cancer. International journal of biological sciences, 13(11):1387, 2017.
2. Rebecca L. Siegel, Kimberly D. Miller, and Ahmedin Jemal. Cancer statistics, 2020. CA:
A Cancer Journal for Clinicians, 70(1):7–30, 2020.
3. Ophira Ginsburg, Cheng Har Yip, Ari Brooks, Anna Cabanes, Maira Caleffi, Jorge
Antonio Dunstan Yataco, Bishal Gyawali, Valerie McCormack, Myrna McLaughlin de
Anderson, Ravi Mehrotra, et al. Breast cancer early detection: A phased approach to
implementation. Cancer, 126:2379–2393, 2020.
5. Diederik P. Kingma and Max Welling. Auto-encoding variational Bayes. CoRR,
abs/1312.6114, 2013.
6. Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley,
Sherjil Ozair, Aaron C. Courville, and Yoshua Bengio. Generative adversarial networks.
CoRR, abs/1406.2661, 2014.
7. Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised representation learning
with deep convolutional generative adversarial networks. CoRR, abs/1511.06434, 2015.
9. F. T. Liu, K. M. Ting, and Z.-H. Zhou, “Isolation forest,” IEEE, 2008. doi:
10.1109/ICDM.2008.17.
11. C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, 1995. doi:
10.1007/BF00994018.
12. P. Ch. Mahalanobis. On the generalized distance in statistics. 1936.
13. Han Zhang, Ian Goodfellow, Dimitris Metaxas, and Augustus Odena. Self-attention
generative adversarial networks. In International Conference on Machine Learning,
pages 7354–7363, 2019.
14. Mohamad Baydoun, Mahdyar Ravanbakhsh, Damian Campo, Pablo Marin, David
Martin, Lucio Marcenaro, Andrea Cavallaro, and Carlo S Regazzoni. A multi-perspective
approach to anomaly detection for self-aware embodied agents. In 2018 IEEE
International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages
6598–6602. IEEE, 2018.
15. Korosh Vatanparvar and Mohammad Abdullah Al Faruque. Self-secured control with
anomaly detection and recovery in automotive cyber-physical systems. In 2019 Design,
Automation & Test in Europe Conference & Exhibition (DATE), pages 788–793. IEEE,
2019.
16. Mahdyar Ravanbakhsh, Moin Nabi, Enver Sangineto, Lucio Marcenaro, Carlo
Regazzoni, and Nicu Sebe. Abnormal event detection in videos using generative
adversarial nets. In 2017 IEEE International Conference on Image Processing (ICIP),
pages 1577–1581. IEEE, 2017.
17. Samet Akcay, Amir Atapour Abarghouei, and Toby P Breckon. GANomaly: Semi-
supervised anomaly detection via adversarial training. In Asian conference on computer
vision, pages 622–637. Springer, 2018.
18. YuXing Tang, YouBao Tang, Mei Han, Jing Xiao, and Ronald M Summers. Deep
adversarial one-class learning for normal and abnormal chest radiograph classification. In
Medical Imaging 2019: Computer-Aided Diagnosis, volume 10950, page 1095018.
International Society for Optics and Photonics, 2019.
19. Mahmoud Mostapha, Juan Prieto, Veronica Murphy, Jessica Girault, Mark Foster, Ashley
Rumple, Joseph Blocher, Weili Lin, Jed Elison, John Gilmore, et al. Semi-supervised
VAE-GAN for out-of-sample detection applied to MRI quality control. In International
Conference on Medical Image Computing and Computer-Assisted Intervention, pages
127–136. Springer, 2019.
20. Mohammad Sabokrou, Masoud Pourreza, Mohsen Fayyaz, Rahim Entezari, Mahmood
Fathy, Jürgen Gall, and Ehsan Adeli. AVID: Adversarial visual irregularity detection. In
Asian Conference on Computer Vision, pages 488–505. Springer, 2018.
21. Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow, and Brendan Frey. Adversarial
autoencoders. arXiv preprint arXiv:1511.05644, 2015.
22. B Ravi Kiran, Dilip Mathew Thomas, and Ranjith Parakkal. An overview of deep learning based
methods for unsupervised and semi-supervised anomaly detection in videos. arXiv preprint
arXiv:1801.03149, 2018.
23. Yuning Qiu, Teruhisa Misu, and Carlos Busso. Driving anomaly detection with
conditional generative adversarial network using physiological and CAN-Bus data. In
2019 International Conference on Multimodal Interaction, pages 164–173, 2019.
24. Wallace Lawson, Esube Bekele, and Keith Sullivan. Finding anomalies with generative
adversarial networks for a patrolbot. In Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition Workshops, pages 12–13, 2017
25. Flamm M, Diamond S. 2012 Multiscale systems biology and physics of thrombosis
under flow. Ann. Biomed. Eng. 40, 2355–2364. (doi:10.1007/s10439-012-0557-9).
26. Hemostasis and thrombosis: basic principles and clinical practice, 5th edn. (Brief Article)
(Book Review). 2006. SciTech Book News.
27. Kai Jiang, Weiying Xie, Yunsong Li, Jie Lei, Gang He, and Qian Du. Semi-supervised
spectral learning with generative adversarial network for hyperspectral anomaly detection.
IEEE Transactions on Geoscience and Remote Sensing, 2020.
28. Haruna Watanabe, Ren Togo, Takahiro Ogawa, and Miki Haseyama. Bone metastatic
tumor detection based on AnoGAN using CT images. In 2019 IEEE 1st Global
Conference on Life Sciences and Technologies (LifeTech), pages 235–236. IEEE, 2019.
29. Samet Akcay, Amir Atapour Abarghouei, and Toby P Breckon. GANomaly: Semi-
supervised anomaly detection via adversarial training. In Asian conference on computer
vision, pages 622–637. Springer, 2018.
30. Mahmoud Mostapha, Juan Prieto, Veronica Murphy, Jessica Girault, Mark Foster, Ashley
Rumple, Joseph Blocher, Weili Lin, Jed Elison, John Gilmore, et al. Semi-supervised
VAE-GAN for out-of-sample detection applied to MRI quality control. In International
Conference on Medical Image Computing and Computer-Assisted Intervention, pages 127–
136. Springer, 2019.
31. Gao L, Pan H, Li Q, Xie X, Zhang Z, Han J, Zhai X. Brain medical image diagnosis based on
corners with importance-values. BMC Bioinform. 2017;18(1):1–13.
https://doi.org/10.1186/s12859-017-1903-6.
32. Han et al. MADGAN: unsupervised Medical Anomaly Detection GAN using multiple adjacent
brain MRI slice reconstruction.
33. Thomas Schlegl, Philipp Seeböck, Sebastian M Waldstein, Georg Langs, and Ursula
Schmidt-Erfurth. f-AnoGAN: Fast unsupervised anomaly detection with generative adversarial
networks. Medical image analysis, 54:30–44, 2019.
34. Sungmin You, Baek Hwan Cho, Soonhyun Yook, Joo Young Kim, Young Min Shon, Dae
Won Seo, and In Young Kim. Unsupervised automatic seizure detection for focal-onset
seizures recorded with behind-the-ear EEG using an anomaly-detecting generative
adversarial network. Computer Methods and Programs in Biomedicine, page 105472,
2020.
35. Chengfen Zhang, Yue Wang, Xinyu Zhao, Yan Guo, Guotong Xie, Chuanfeng Lv, and
Bin Lv. Memory-augmented anomaly generative adversarial network for retinal OCT
images screening. In 2020 IEEE 17th International Symposium on Biomedical Imaging
(ISBI), pages 1971–1974. IEEE, 2020.
36. Yan Kuang, Tian Lan, Xueqiao Peng, Gati Elvis Selasi, Qiao Liu, and Junyi Zhang.
Unsupervised multi-discriminator generative adversarial network for lung nodule
malignancy classification. IEEE Access, 8:77725–77734, 2020.
37. YuXing Tang, YouBao Tang, Mei Han, Jing Xiao, and Ronald M Summers. Abnormal
chest X-ray identification with generative adversarial one-class classifier. In 2019 IEEE
16th International Symposium on Biomedical Imaging (ISBI 2019), pages 1358–1361.
IEEE, 2019.