Title 2022-2023

CHAPTER – 2
ARCHITECTURE AND DESIGN

Dept. of ISE, DSCE 1



This chapter gives a brief overview of the design theory and concepts that have been
used in the project.

2.1 PROJECT ARCHITECTURE


The main architecture for Live Criminal Activity Detection through CCTV feed is based on the
encoder-decoder model.

A video consists of an ordered sequence of frames. Each frame contains spatial information,
and the sequence of those frames contains temporal information. To model both of these
aspects, we use a hybrid architecture that consists of convolutional layers (for spatial
processing) as well as recurrent layers (for temporal processing). The first model, a CNN,
extracts the spatial features of each frame and converts them into an encoded feature vector;
it is therefore called the encoder. Similarly, the second model, an RNN, processes mini-batches
of encoded frames to produce the final classification result; it is therefore called the decoder.

Fig 2.1: Architecture Diagram
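
The data flow of the encoder-decoder model described above can be illustrated with a small,
dependency-free sketch. It is not the project's model: average pooling stands in for the CNN
encoder, a plain Elman recurrence stands in for the RNN decoder, and the weights are random
rather than trained. The names encode_frame, decode_sequence and random_matrix, as well as all
sizes, are assumptions made only for this illustration.

```python
import math
import random


def encode_frame(frame, grid=2):
    """Stand-in for the CNN encoder (assumption): average-pool a 2-D
    grayscale frame into grid * grid values, i.e. one feature vector."""
    height, width = len(frame), len(frame[0])
    bh, bw = height // grid, width // grid
    features = []
    for gy in range(grid):
        for gx in range(grid):
            block = [frame[y][x]
                     for y in range(gy * bh, (gy + 1) * bh)
                     for x in range(gx * bw, (gx + 1) * bw)]
            features.append(sum(block) / len(block))
    return features


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_sequence(embeddings, w_in, w_rec, w_out):
    """Stand-in for the RNN decoder (assumption): a simple Elman recurrence
    over the encoded frames, followed by a softmax over the classes."""
    hidden = [0.0] * len(w_rec)
    for x in embeddings:
        # The new state depends on the current embedding and the old state.
        hidden = [
            math.tanh(sum(wi * xi for wi, xi in zip(w_in[j], x))
                      + sum(wr * hi for wr, hi in zip(w_rec[j], hidden)))
            for j in range(len(hidden))
        ]
    scores = [sum(wo * hi for wo, hi in zip(row, hidden)) for row in w_out]
    return softmax(scores)


def random_matrix(rows, cols, rng):
    """Untrained placeholder weights; a real model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
FEATURES, HIDDEN, CLASSES, FRAMES = 4, 3, 2, 5  # illustrative sizes only
w_in = random_matrix(HIDDEN, FEATURES, rng)
w_rec = random_matrix(HIDDEN, HIDDEN, rng)
w_out = random_matrix(CLASSES, HIDDEN, rng)

# A synthetic clip of five 4x4 frames takes the place of the CCTV feed.
clip = [[[rng.random() for _ in range(4)] for _ in range(4)]
        for _ in range(FRAMES)]
embeddings = [encode_frame(frame) for frame in clip]  # spatial step, per frame
probabilities = decode_sequence(embeddings, w_in, w_rec, w_out)  # temporal step
```

The point of the sketch is the split of responsibilities: the encoder sees one frame at a time
and never looks at its neighbours, while the decoder sees only the sequence of feature vectors
and never the raw pixels.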


2.2 DESIGN DIAGRAMS

Real-time video processing makes the data pipeline more complex to handle. We aim to
minimize latency in the streaming video while still ensuring sufficient accuracy of the
implemented models.

The first part of this process, preprocessing and encoding frames, is a serial process that
is applied to each frame as it arrives sequentially from the CCTV feed.

The second part, decoding a batch of embeddings to obtain predictions as class probabilities,
is performed once the required number of frames is available.
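
The hand-off between the two parts can be sketched as a small buffer that collects embeddings
and releases them only when a full batch is ready. The class name EmbeddingBatcher and the
batch size of 3 are hypothetical, and the use of non-overlapping batches is an assumption; a
sliding window would be an equally plausible choice.

```python
class EmbeddingBatcher:
    """Collects per-frame embeddings and releases them in fixed-size batches."""

    def __init__(self, batch_size):
        if batch_size <= 0:
            raise ValueError("batch_size must be positive")
        self.batch_size = batch_size
        self._buffer = []

    def push(self, embedding):
        """Store one embedding; return a full batch, or None if not ready yet."""
        self._buffer.append(embedding)
        if len(self._buffer) < self.batch_size:
            return None
        # Hand over the full batch and start a fresh, empty buffer.
        batch, self._buffer = self._buffer, []
        return batch


# Seven single-value "embeddings" arrive one after another.
batcher = EmbeddingBatcher(batch_size=3)
ready = [batcher.push([float(i)]) for i in range(7)]
```

Here only the third and sixth calls to push return a batch; the seventh embedding stays in the
buffer until two more frames have arrived.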

To make the best use of the hardware resources and to reduce latency and lag while
monitoring the live feed, we can use a pipeline approach, which splits the processing into
stages and runs those stages in parallel.

Fig 2.2: Pipeline Approach
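
One possible shape for such a pipeline is sketched below: each stage runs in its own thread and
passes its output to the next stage through a bounded queue. This is a minimal sketch under
stated assumptions, not the project's implementation. The names encode_stage, decode_stage and
run_pipeline are hypothetical, and the encode and decode callables are trivial placeholders for
the CNN and RNN.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage to shut down


def encode_stage(frames_in, embeddings_out, encode):
    """Stage 1: encode frames one by one, in arrival order."""
    while True:
        frame = frames_in.get()
        if frame is STOP:
            embeddings_out.put(STOP)  # pass the shutdown signal downstream
            return
        embeddings_out.put(encode(frame))


def decode_stage(embeddings_in, results_out, decode, batch_size):
    """Stage 2: decode once batch_size embeddings have been collected."""
    batch = []
    while True:
        item = embeddings_in.get()
        if item is STOP:
            results_out.put(STOP)  # an incomplete final batch is discarded
            return
        batch.append(item)
        if len(batch) == batch_size:
            results_out.put(decode(batch))
            batch = []


def run_pipeline(frames, encode, decode, batch_size, queue_size=8):
    """Feed frames through both stages and return the decoded results."""
    frames_q = queue.Queue(maxsize=queue_size)      # bounded: applies backpressure
    embeddings_q = queue.Queue(maxsize=queue_size)
    results_q = queue.Queue()                       # unbounded, drained at the end
    workers = [
        threading.Thread(target=encode_stage,
                         args=(frames_q, embeddings_q, encode)),
        threading.Thread(target=decode_stage,
                         args=(embeddings_q, results_q, decode, batch_size)),
    ]
    for worker in workers:
        worker.start()
    for frame in frames:
        frames_q.put(frame)
    frames_q.put(STOP)
    for worker in workers:
        worker.join()

    results = []
    while True:
        result = results_q.get()
        if result is STOP:
            return results
        results.append(result)


# Placeholder stages: "encoding" doubles a number, "decoding" sums a batch.
results = run_pipeline(range(10), encode=lambda f: f * 2, decode=sum, batch_size=4)
```

With this arrangement the encoder can already work on the next frame while the decoder is busy
with the previous batch. In CPython, threads give real parallelism only while a stage releases
the global interpreter lock, for example during I/O or inside native inference code, so a
process-based variant may be a better fit for CPU-bound pure-Python stages.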
