Problem 2 Proposal

Geek-AI-Mania ‘18
2018
[College Details]
[Member 1] – [Role]
[Member 2] – [Role]
Name Document Revision Date Comments

Contents
1. Abstract
2. Problem Statement
3. Background Search
4. Approach to the Solution
5. Solution Description
6. Technology Stack & Architecture (Software and Hardware)
7. Experimentations & Results
8. Working Model & Accuracy Test Result
9. Future Scope of Solution
10. Conclusion
11. Appendix (as applicable)
1. Abstract
In this work, we present an anomaly detection model for human movement on pedestrian walkways.
The functional problem tackled is the identification of non-pedestrian entities in the walkways;
commonly occurring anomalies include bikers, skaters, small carts, and people walking across a
walkway. The input is real-world video footage captured by a stationary camera mounted at an
elevation, overlooking a pedestrian walkway. We use the TensorFlow package, with a Faster R-CNN
model trained on the dataset. Once the model is trained, images can be tested. During testing,
whenever a test image violates the model, the anomalous object is detected. The deep-learning-based
R-CNN model detects the trained objects in real-world scenarios with high confidence, and the
number of detected objects nearly matches the number of desired objects.
2. Problem Statement
Anomalies in videos are broadly defined as events that are unusual and signify irregular behavior.
Consequently, anomaly detection has broad applications in many different areas, including
surveillance, intrusion detection, health monitoring, and event detection. Unusual events of
interest in long video sequences, e.g. surveillance footage, often have an extremely low
probability of occurring. As such, manually detecting these rare events, or anomalies, is a very
meticulous task that often requires more manpower than is generally available. This has prompted
the need for automated detection and segmentation of sequences of interest [1]-[15]. Owing to the
recent attention to safety, large numbers of cameras are installed in many places, recording very
long video sequences. Current surveillance systems, however, depend on manual operation to search
long-duration recordings for persons behaving abnormally, so in practice it is difficult to detect
anomalies across many cameras. Moreover, abnormal behavior has no single specific feature but
various ones, including roaming, looking around, and diverting a salesclerk’s attention. That is
to say, to discover anomalies, a surveillance system should detect persons whose behavior deviates
from general patterns.
3. Background Search
Deep learning is the new big trend in machine learning. It has had many recent successes in
computer vision, automatic speech recognition and natural language processing.
Classification using a machine learning algorithm has 2 phases:
• Training phase: In this phase, we train a machine learning algorithm using a dataset
comprising the images and their corresponding labels.
• Prediction phase: In this phase, we utilize the trained model to predict labels of unseen
images.
The training phase for an image classification problem has 2 main steps:
• Feature Extraction: In this step, we utilize domain knowledge to extract new features that will
be used by the machine learning algorithm. HoG and SIFT are examples of features used in
image classification.
• Model Training: In this step, we utilize a clean dataset composed of the images' features and
the corresponding labels to train the machine learning model.
In the prediction phase, we apply the same feature extraction process to the new images and we
pass the features to the trained machine learning algorithm to predict the label.
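The two phases described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature is a heavily simplified HoG-style orientation histogram, the classifier is a nearest-centroid model, and the function names and toy 5x5 images are hypothetical, not part of the proposed system.

```python
import math

def orientation_histogram(image, bins=4):
    """Hand-crafted feature: magnitude-weighted histogram of gradient orientations."""
    hist = [0.0] * bins
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = image[r][c + 1] - image[r][c - 1]
            gy = image[r + 1][c] - image[r - 1][c]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.atan2(gy, gx) % math.pi  # unsigned orientation in [0, pi)
            hist[min(int(angle / math.pi * bins), bins - 1)] += magnitude
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def train(images, labels):
    """Training phase: extract features, then average them per class."""
    sums, counts = {}, {}
    for image, label in zip(images, labels):
        feat = orientation_histogram(image)
        acc = sums.setdefault(label, [0.0] * len(feat))
        for i, value in enumerate(feat):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(model, image):
    """Prediction phase: same feature extraction, then nearest class centroid."""
    feat = orientation_histogram(image)
    return min(model, key=lambda label: math.dist(feat, model[label]))

# Toy grayscale images (assumed data): intensity changes along x or along y.
vertical_edge = [[0, 0, 1, 1, 1] for _ in range(5)]
horizontal_edge = [[0] * 5, [0] * 5, [1] * 5, [1] * 5, [1] * 5]
model = train([vertical_edge, horizontal_edge], ["vertical", "horizontal"])

unseen = [[0, 0, 0, 1, 1] for _ in range(5)]
print(predict(model, unseen))
```

Note that `predict` reuses exactly the feature extraction applied during training, which is the point made in the paragraph above.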
The main difference between traditional machine learning and deep learning algorithms is in the
feature engineering. In traditional machine learning algorithms, we need to hand-craft the features.
By contrast, in deep learning algorithms feature engineering is done automatically by the algorithm.
Feature engineering is difficult, time-consuming and requires domain expertise. The promise of
deep learning is more accurate models than traditional machine learning, with little or no
feature engineering.
Fig. 1. Machine learning phase
Fig. 2. Deep learning phase
4. Approach to the Solution

Computer vision is an interdisciplinary field that has gained a great deal of traction in recent
years (since the rise of CNNs), with self-driving cars taking centre stage. Another integral part
of computer vision is object detection. Object detection aids in pose estimation, vehicle
detection, surveillance, etc. The difference between object detection algorithms and
classification algorithms is that in detection algorithms, we try to draw a bounding box around
the object of interest to locate it within the image. Moreover, an object detection task does not
necessarily produce just one bounding box: there could be many bounding boxes representing
different objects of interest within the image, and you would not know how many beforehand.
The major reason why this problem cannot be solved by building a standard convolutional network
followed by a fully connected layer is that the length of the output layer is variable, not
constant, because the number of occurrences of the objects of interest is not fixed. A naive
approach would be to take different regions of interest from the image and use a CNN to classify
the presence of the object within each region. The problem with this approach is that the objects
of interest might have different spatial locations within the image and different aspect ratios.
Hence, you would have to select a huge number of regions, and this could blow up computationally.
Therefore, algorithms like R-CNN, YOLO, etc. have been developed to find these occurrences and
find them fast.
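To get a feel for how quickly the naive approach grows, the following sketch counts the candidate windows produced by exhaustively sliding boxes of several sizes and aspect ratios over an image. The image size, window sizes, aspect ratios and stride are all assumed values chosen for illustration, and `count_windows` is a hypothetical helper.

```python
def count_windows(image_w, image_h, sizes, aspect_ratios, stride):
    """Count every window position for each (size, aspect ratio) pair."""
    total = 0
    for size in sizes:
        for ratio in aspect_ratios:
            # Keep the window area close to size*size while changing width/height.
            w = round(size * ratio ** 0.5)
            h = round(size / ratio ** 0.5)
            if w > image_w or h > image_h:
                continue  # window does not fit in the image
            positions_x = (image_w - w) // stride + 1
            positions_y = (image_h - h) // stride + 1
            total += positions_x * positions_y
    return total

# Assumed settings: a 640x480 frame, four window sizes, three aspect ratios.
n = count_windows(640, 480, sizes=[32, 64, 128, 256],
                  aspect_ratios=[0.5, 1.0, 2.0], stride=8)
print(n, "regions would each need a CNN forward pass")
```

Even with a coarse stride of 8 pixels, the square 32-pixel window alone yields several thousand positions, and each one would need its own CNN classification.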
Problems with R-CNN
• It still takes a huge amount of time to train the network as you would have to classify 2000
region proposals per image.
• It cannot be implemented in real time as it takes around 47 seconds for each test image.
• The selective search algorithm is a fixed algorithm. Therefore, no learning is happening at that
stage. This could lead to the generation of bad candidate region proposals.
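The real-time limitation can be made concrete with a little arithmetic. The sketch below takes the figure of about 47 seconds per test image from the list above and assumes, purely for illustration, a 25 frames-per-second camera with every frame processed; `processing_hours` is a hypothetical helper.

```python
def processing_hours(video_seconds, fps, seconds_per_image=47.0):
    """Hours of compute needed if every frame is run through the detector."""
    frames = video_seconds * fps
    return frames * seconds_per_image / 3600.0

# One minute of footage at an assumed 25 fps.
hours = processing_hours(video_seconds=60, fps=25)
print(f"{hours:.1f} hours of compute for one minute of video")
```

Under these assumptions, a single minute of footage would take about 19.6 hours to process, which is why R-CNN is unsuitable for live surveillance.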
Fast R-CNN
Fig. 3. Fast R-CNN
Fast R-CNN addresses some of the drawbacks of R-CNN to build a faster object detection algorithm.
The approach is similar to the R-CNN algorithm, but instead of feeding the region proposals to the
CNN, we feed the input image to the CNN to generate a convolutional feature map. From the
convolutional feature map, we identify the region proposals and warp them into squares, and by
using an RoI pooling layer we reshape them into a fixed size so that they can be fed into a fully
connected layer. From the RoI feature vector, we use a softmax layer to predict the class of the
proposed region, and we also predict the offset values for the bounding box.
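The RoI pooling step can be illustrated with a minimal pure-Python sketch. It assumes a single-channel feature map stored as a list of lists and a region given as end-exclusive `(top, left, bottom, right)` indices on that map; `roi_max_pool` is a hypothetical helper, not the TensorFlow implementation.

```python
def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool the region roi = (top, left, bottom, right) into an out_h x out_w grid."""
    top, left, bottom, right = roi
    roi_h, roi_w = bottom - top, right - left
    pooled = []
    for i in range(out_h):
        # Row bin: floor for the start, ceiling for the end, at least one row.
        r0 = top + (i * roi_h) // out_h
        r1 = max(top + ((i + 1) * roi_h + out_h - 1) // out_h, r0 + 1)
        row = []
        for j in range(out_w):
            # Column bin computed the same way.
            c0 = left + (j * roi_w) // out_w
            c1 = max(left + ((j + 1) * roi_w + out_w - 1) // out_w, c0 + 1)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled

feature_map = [
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 10, 11, 12],
    [13, 14, 15, 16, 17, 18],
    [19, 20, 21, 22, 23, 24],
]

# Two proposals of different sizes both come out as 2x2 grids.
print(roi_max_pool(feature_map, (0, 0, 4, 6), 2, 2))  # [[9, 12], [21, 24]]
print(roi_max_pool(feature_map, (1, 1, 4, 4), 2, 2))  # [[15, 16], [21, 22]]
```

Whatever the size of the proposal, the output has the same shape, which is what allows it to be passed to a fully connected layer.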
The reason “Fast R-CNN” is faster than R-CNN is that you don’t have to feed 2000 region
proposals to the convolutional neural network every time. Instead, the convolution operation is
done only once per image and a feature map is generated from it.
Fig. 4. Comparison of object detection algorithms
From the above graphs, we can infer that Fast R-CNN is significantly faster than R-CNN in both
training and testing.
Fig. 5. Faster R-CNN

5. Solution Description
Our system comprises a training component and a detection algorithm running SSD. SSD is
compute-intensive, but has been optimized for Intel® architecture. We adopted Caffe*, optimized
for Intel architecture, as our deep learning framework, and the hardware is an Intel Xeon Gold
processor.
In this work, the entire solution is divided into three stages:
A. Dataset preparation
B. Network topology and model training
C. Inferencing
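As a rough illustration of the inferencing stage (C), the sketch below filters the detections returned for one frame and keeps only the non-pedestrian classes. The `Detection` structure, the label names (taken from the anomaly types listed in the abstract) and the 0.5 confidence threshold are all assumptions made for this example, not the implemented pipeline.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str    # class name predicted by the detector
    score: float  # confidence in [0, 1]
    box: tuple    # (x_min, y_min, x_max, y_max) in pixels

# Hypothetical label set for non-pedestrian entities.
ANOMALY_LABELS = {"biker", "skater", "cart"}

def flag_anomalies(detections, min_score=0.5):
    """Keep confident detections whose class is a non-pedestrian entity."""
    return [d for d in detections
            if d.label in ANOMALY_LABELS and d.score >= min_score]

# Made-up detector output for a single frame.
frame = [
    Detection("pedestrian", 0.93, (40, 60, 80, 160)),
    Detection("biker", 0.81, (120, 50, 190, 170)),
    Detection("skater", 0.30, (200, 70, 240, 165)),
]

for d in flag_anomalies(frame):
    print(f"anomaly: {d.label} ({d.score:.2f}) at {d.box}")
```

The low-confidence skater is dropped by the threshold here; in practice the threshold would need to be tuned on validation footage.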
6. Technology Stack & Architecture (Software and Hardware)

1. Python
2. TensorFlow (package)
3. Faster R-CNN (method)
4. i7 processor, NVIDIA GPU with 8 GB RAM (1080 Ti), 16 GB RAM
7. Experimentations & Results

The following detections were obtained when the inference use case was run on the sample images
below.
8. Working Model & Accuracy Test Result
Image Resizer
Training Images
Class id Generator
CNN Based Predictor
9. Future Scope of Solution

The proposed model can be employed to detect anomalies in various real-time applications such as
surveillance, intrusion detection, health monitoring, and event detection.
10. Conclusion
The proposed work has been successfully implemented using the TensorFlow framework with Python.
The use of Faster R-CNN leads to fast and accurate results. During testing, we found that the
proposed work identifies the anomalies effectively.
11. Appendix (as applicable)
