CHAPTER 1
INTRODUCTION
the over multiple frames are observed. A holistic approach extracts the
behavioral characteristics of the objects in raw video, and statistical patterns
are recognized using hidden Markov models. The holistic approach overcomes the
limitations of the object-based method. The behavioral analysis model involves
understanding the entire scene in a video rather than focusing on individual
objects and their activity. These approaches suffer from an increased rate of
false positives.
crowd density estimation, tracking the crowd population and crowd behavior
analysis.
Haar wavelet
only three or four memory locations are required to calculate the sum of
intensities of a selected rectangular region of any size, as shown in Figure 1.2.
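The constant-cost rectangle sum described above can be sketched as follows. This is a minimal illustration that assumes the sentence refers to an integral image (summed-area table); the function names and the 4x4 toy image are hypothetical and not taken from the original text.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with an extra zero row and column at the top/left."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of img[top:top+height, left:left+width] using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

img = np.arange(1, 17).reshape(4, 4)   # toy 4x4 "image" with intensities 1..16
ii = integral_image(img)
total = rect_sum(ii, 1, 1, 2, 2)       # rows 1-2, columns 1-2
```

Once the table is built, the cost of `rect_sum` does not depend on the size of the rectangle, which is the property the text relies on.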
The feature template can be set to a sub-window of any size. Once the
template forms are identified, the number of features can be estimated from
the size of the rectangle templates and the size of the training sample images.
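As a rough illustration of how the feature count follows from the template and sample sizes, the sketch below enumerates every position and integer scaling of a template inside a window. The 24x24 window and the five template shapes are assumptions chosen for the example, not values stated in the text.

```python
def count_features(window_w, window_h, base_w, base_h):
    """Count all positions and integer scalings of one rectangle template."""
    count = 0
    for w in range(base_w, window_w + 1, base_w):        # scaled widths
        for h in range(base_h, window_h + 1, base_h):    # scaled heights
            count += (window_w - w + 1) * (window_h - h + 1)
    return count

# Hypothetical template set: base (width, height) of five common Haar-like shapes.
templates = [(2, 1), (1, 2), (3, 1), (1, 3), (2, 2)]
total = sum(count_features(24, 24, bw, bh) for bw, bh in templates)
```

Under these assumptions a single two-rectangle template already yields tens of thousands of candidate features, which is why the count grows quickly with the sample size.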
point of PCA, the projection onto the region of interest for detection can be
obtained. Pedestrian detection can be performed by integrating PCA with HOG for
improved performance. PCA is sensitive to the relative scaling of the original
variables.
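The scaling sensitivity mentioned above can be seen in a small numerical sketch. The synthetic two-variable data set, the factor of 100 and the helper name `first_component` are all assumptions made for illustration.

```python
import numpy as np

def first_component(data):
    """Unit eigenvector of the covariance matrix with the largest eigenvalue."""
    cov = np.cov(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return eigvecs[:, -1]

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.8 * x + 0.6 * rng.normal(size=500)     # correlated with x, similar variance
data = np.column_stack([x, y])

rescaled = data * np.array([100.0, 1.0])     # first variable expressed in other units
standardized = (rescaled - rescaled.mean(axis=0)) / rescaled.std(axis=0)

pc_raw = first_component(data)
pc_rescaled = first_component(rescaled)
pc_standardized = first_component(standardized)
```

After rescaling, the first component points almost entirely along the rescaled variable; standardizing each variable to unit variance before PCA removes that dependence on units.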
the observed data. ICA has been used for pedestrian detection, but it suffers
from the problems of over-complete ICA and under-complete ICA.
defining a hyperplane between the non-linear data points of the images.
The greater the distance between the hyperplane and the data points, the higher
the accuracy of classification. SVM can also be used for selecting useful
features from the available data points of the images.
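The idea of preferring the hyperplane that lies farthest from the data points can be sketched numerically. The example below does not train an SVM; it only compares two hand-picked separating lines on a linearly separable toy set, and all points, weights and names are hypothetical.

```python
import numpy as np

def geometric_margin(w, b, X, y):
    """Smallest signed distance of any labelled point to the hyperplane w.x + b = 0."""
    return np.min(y * (X @ w + b) / np.linalg.norm(w))

X = np.array([[1.0, 1.0], [2.0, 0.0], [4.0, 4.0], [5.0, 3.0]])
y = np.array([-1, -1, 1, 1])

margin_a = geometric_margin(np.array([1.0, 1.0]), -5.0, X, y)   # line x1 + x2 = 5
margin_b = geometric_margin(np.array([1.0, 0.0]), -3.0, X, y)   # line x1 = 3
```

Both lines separate the two classes, but the first keeps every point farther away, so a maximum-margin classifier would prefer it.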
1.5.2.3 XGBOOST
Machine learning algorithms work well with small data sets, but when
applied to large data they suffer from issues such as underfitting, model
complexity and lack of resource optimization. To overcome these issues, deep
learning networks can be applied to big data for knowledge discovery,
knowledge-based prediction and knowledge application. Deep learning
enables machine models to learn directly from images, video or text.
Different deep learning architectures exist, and they have helped achieve
remarkable performance compared with other machine learning methods as the
data size increases. Some widely used deep learning models are discussed in the
following section (Figure 1.6).
[Figure: Deep Learning Models]
1.5.3.3 Auto-Encoders
network layers namely encoder and decoder. The input to encoder layer is the
functionality is to encode this
input to a latent representation space z. A Gaussian distribution is used for
encoding, and the encoder outputs the mean and variance of that distribution.
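An encoder that outputs the mean and variance of a Gaussian over z can be sketched as below. This assumes a variational-style auto-encoder with two linear output heads and the usual sampling step z = mu + sigma * eps; the layer sizes, the untrained random weights and the function names are hypothetical and not taken from the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, latent_dim = 8, 2            # hypothetical sizes

# Hypothetical, untrained weights for two linear output heads.
W_mu = rng.normal(scale=0.1, size=(input_dim, latent_dim))
W_logvar = rng.normal(scale=0.1, size=(input_dim, latent_dim))

def encode(x):
    """Return the mean and variance of the Gaussian over the latent space z."""
    mu = x @ W_mu
    var = np.exp(x @ W_logvar)          # exponentiate so the variance is positive
    return mu, var

def sample_latent(mu, var):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.sqrt(var) * eps

x = rng.normal(size=(4, input_dim))     # a batch of four toy inputs
mu, var = encode(x)
z = sample_latent(mu, var)
```

Predicting the log-variance and exponentiating it is one common way to keep the variance positive; other parameterizations are possible.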
The filter computes a dot product between the pixel values and the weights
defined in the filter, and sums the result into one value representing all the
pixels covered by the filter. Thus, the convolutional layer generates a matrix
of data points that is smaller than the original image. This matrix is passed to
the activation layer, which provides non-linearity, and the network is trained
through backpropagation. The pooling layer down-samples and further reduces the
size of the matrix produced by the filter. The pooling layer selects one feature,
the maximum, out of each group and is therefore called a max pooling layer. The
fully connected layer takes the output of the max pooling layers and produces
a list of probabilities for the possible labels of the given image. The
classification decision is based on the label with the highest probability.
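The layer sequence described above can be followed end to end in a minimal forward-pass sketch. It assumes a single 3x3 filter, ReLU as the activation, 2x2 max pooling and a softmax output over three labels; the image, the weights and the sizes are random, hypothetical values, and no training (backpropagation) is shown.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the filter without padding: one weighted sum per position."""
    kh, kw = kernel.shape
    out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Activation layer: keep positive values, zero out the rest."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Keep the largest value in each non-overlapping size x size group."""
    h, w = x.shape[0] // size, x.shape[1] // size
    trimmed = x[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

def softmax(scores):
    """Turn raw scores into probabilities that sum to one."""
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()

rng = np.random.default_rng(0)
image = rng.random((6, 6))
kernel = rng.normal(size=(3, 3))
dense_weights = rng.normal(size=(4, 3))            # 2x2 pooled map -> 3 labels

feature_map = relu(conv2d(image, kernel))          # 6x6 -> 4x4
pooled = max_pool(feature_map)                     # 4x4 -> 2x2
probabilities = softmax(pooled.reshape(-1) @ dense_weights)
predicted_label = int(np.argmax(probabilities))    # label with highest probability
```

The shapes in the comments show how each stage shrinks the data before the final layer assigns one probability per label.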
1.8 SUMMARY