
PEOPLE COUNTING AND

INDIVIDUAL DETECTION
TEAM MEMBERS
AJAY KRISHNAN 20BEC1355
JANAKI MANOJ 20BEC1213
OBJECTIVE

The two main objectives of this paper are:


• Estimating the number of people.
• Detecting/tracking human beings in low-resolution images with complex backgrounds.
• First, grey scale images are generated.
• An adaptive background estimation method is used to estimate the background.
• The background is then subtracted from the original image to obtain the foreground image.
• Figure 1 shows the image after background subtraction.
• Perspective correction is applied to bring objects at different locations to the same scale.
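
The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the slides only say that an adaptive background estimate is used, so the running-average update, the learning rate `alpha`, the threshold `thresh`, and the toy frames are all assumptions made for the example.

```python
import numpy as np

def to_gray(rgb):
    """Convert an H x W x 3 RGB image to grey scale (standard luma weights)."""
    return rgb[..., 0] * 0.299 + rgb[..., 1] * 0.587 + rgb[..., 2] * 0.114

def update_background(background, gray, alpha=0.05):
    """Assumed adaptive estimate: exponential running average of past frames."""
    return (1.0 - alpha) * background + alpha * gray

def foreground_mask(background, gray, thresh=25.0):
    """Background subtraction: keep pixels that differ strongly from the background."""
    return np.abs(gray - background) > thresh

# Toy data (synthetic): a flat grey scene with a bright 2 x 2 object in the last frame.
frames = [np.full((6, 8, 3), 100.0) for _ in range(5)]
frames[-1][2:4, 3:5, :] = 200.0

background = to_gray(frames[0])
for frame in frames[1:-1]:
    background = update_background(background, to_gray(frame))

mask = foreground_mask(background, to_gray(frames[-1]))
n_foreground = int(mask.sum())   # number of foreground pixels in the last frame
```

A small `alpha` makes the background adapt slowly, so people who pause briefly are less likely to be absorbed into it. The right value depends on the scene.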


• Δx(y) = horizontal (vertical) scale of an object at y
• Δxref = horizontal (vertical) reference scale
• q(y) = ratio for different locations
• yref = reference location
• In the perspective correction method, yv = the line where the extensions of parallel lines
meet (vanishing point)
• Computation of total number of foreground pixels
• imgY = height of the processing image
• N(y) = number of foreground pixels in the yth row
• Npixel = total number of foreground pixels
• q(y) = ratio for different locations
• The relation between the total number of foreground pixels and the number of people is estimated using a neural network.
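
Before that relation can be learned, the perspective-corrected pixel count has to be computed from the quantities defined above. The slides do not reproduce the formulas, so the sketch below rests on two assumptions: q(y) is taken as Δx(y) / Δxref = (y - yv) / (yref - yv), which follows from linear perspective on a ground plane, and each row count N(y) is divided by q(y)^2 so that a person contributes roughly the same corrected area at any row. Image rows are assumed to increase downward, with the vanishing line near the top.

```python
import numpy as np

def scale_ratio(y, y_ref, y_v):
    """Assumed ratio q(y) = Δx(y) / Δxref for rows below the vanishing line y_v."""
    return (y - y_v) / (y_ref - y_v)

def corrected_pixel_count(mask, y_ref, y_v):
    """Sum N(y) over rows, weighting each row by 1 / q(y)^2 (assumed weighting)."""
    img_y = mask.shape[0]                     # imgY: height of the image
    rows = np.arange(img_y, dtype=float)
    n_y = mask.sum(axis=1).astype(float)      # N(y): foreground pixels in row y
    valid = rows > y_v                        # skip rows at or above the vanishing line
    q = scale_ratio(rows[valid], y_ref, y_v)
    return float(np.sum(n_y[valid] / q ** 2))

# Toy mask (synthetic): 4 pixels at the reference row, 8 pixels twice as far from y_v.
mask = np.zeros((10, 10), dtype=bool)
mask[4, 0:4] = True    # q = 1, counts as 4
mask[8, 0:8] = True    # q = 2, counts as 8 / 4 = 2
n_pixel = corrected_pixel_count(mask, y_ref=4, y_v=0)
```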

• METHOD 1
• The number of foreground pixels is given directly as input, and its relation to the total number of people is found.

M = f(x)
where M = total number of people
x = number of foreground pixels

• METHOD 2
• Based on closed foreground pixels
Solid foreground blobs represent moving people.
Scattered pixels represent a stationary crowd.
To bring uniformity, areas occupied by people are covered with white pixels and all other areas with
black pixels. These images are given as input to the neural network, and the relationship is found.
M = f2(c)
M = total number of individuals
c = total number of closed foreground pixels
• METHOD 3
• Both the foreground pixels and the foreground pixels after the closing operation are given to the neural
network, and the relation is found.
• M = f3(c, x)
• The foreground pixels obtained after the closing operation are the ones used for further processing.
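
The two inputs used by the three methods, x (foreground pixels) and c (closed foreground pixels), can be computed as in the sketch below. The 3 x 3 structuring element and the toy mask are assumptions, since the slides do not state which element is used, and the neural network itself is not shown.

```python
import numpy as np

def dilate(mask):
    """3 x 3 binary dilation: a pixel is set if any pixel in its neighbourhood is set."""
    padded = np.pad(mask, 1, constant_values=False)
    out = np.zeros_like(mask)
    h, w = mask.shape
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def erode(mask):
    """3 x 3 binary erosion: a pixel survives only if its whole neighbourhood is set."""
    padded = np.pad(mask, 1, constant_values=True)   # border treated as foreground
    out = np.ones_like(mask)
    h, w = mask.shape
    for dy in range(3):
        for dx in range(3):
            out &= padded[dy:dy + h, dx:dx + w]
    return out

def close(mask):
    """Morphological closing: dilation followed by erosion, which fills small holes."""
    return erode(dilate(mask))

# Toy mask (synthetic): a 3 x 3 blob with a one-pixel hole in the middle.
mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 2:5] = True
mask[3, 3] = False

closed = close(mask)
x = int(mask.sum())     # input of Method 1: number of foreground pixels
c = int(closed.sum())   # input of Method 2: number of closed foreground pixels
features = (c, x)       # input of Method 3: both values together
```

Closing fills the hole in the blob, which is how scattered pixels of a stationary crowd come to resemble the solid blobs of moving people.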
HUMAN TRACKING/DETECTION

• Segmenting foreground blobs alone would not work, as the resolution of the image is low.
• For feature detection

Kanade-Lucas-Tomasi (KLT) algorithm is employed for tracking and detecting corners.


KLT features are used because of KLT's good performance for tracking in subsequent video processing.
First, a small window is taken from the image and its derivatives with respect to x and y are computed.
These are represented by the gradient vector
g(x, y) = [∂I/∂x, ∂I/∂y]^T
where I is the intensity.
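
As an illustration of how the gradient is used, the sketch below sums the products g g^T over a small window and scores each pixel by the smaller eigenvalue of that 2 x 2 matrix, which is the usual KLT criterion for a good feature. The window size and the toy image are assumptions, and a production system would normally use a library implementation rather than this hand-written version.

```python
import numpy as np

def window_sum(values, half_window):
    """Sum of `values` over a square window around each pixel (zero padded)."""
    padded = np.pad(values, half_window)
    out = np.zeros_like(values)
    h, w = values.shape
    size = 2 * half_window + 1
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + h, dx:dx + w]
    return out

def min_eigenvalue_map(image, half_window=1):
    """Smaller eigenvalue of the windowed sum of g g^T at every pixel."""
    iy, ix = np.gradient(image.astype(float))   # ∂I/∂y (rows), ∂I/∂x (columns)
    sxx = window_sum(ix * ix, half_window)
    syy = window_sum(iy * iy, half_window)
    sxy = window_sum(ix * iy, half_window)
    # Closed-form smaller eigenvalue of [[sxx, sxy], [sxy, syy]].
    return 0.5 * (sxx + syy) - np.sqrt(0.25 * (sxx - syy) ** 2 + sxy ** 2)

# Toy image (synthetic): a bright square filling the lower-right quadrant.
image = np.zeros((9, 9))
image[4:, 4:] = 1.0
score = min_eigenvalue_map(image)
corner_score = score[4, 4]   # corner of the square: gradients in both directions
edge_score = score[7, 4]     # straight edge: gradient in one direction only
flat_score = score[1, 1]     # flat region: no gradient
```

Only the corner gets a clearly positive score, which is why such features concentrate on contours and corners rather than on flat regions or straight edges.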
• Two parameters have to be set in the KLT algorithm
• 1. The number of features to be detected
• 2. The minimum distance between 2 feature points
• The foreground mask obtained after the closing operation is then used to filter out all
feature points from the background.
• Now we have only the feature points on human contours.
• Clustering is then used for human detection with these feature points.
• Before clustering, we need a cluster model.
• Here, an ellipse is used as the cluster model.
• In this model, a vertical ellipse with semi-major axis, eh, and semi-minor axis, ew, is used to
represent a prior human shape.

• 2 ∗ eh and 2 ∗ ew are the average height and width of a person.
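
The mask filtering and the ellipse model can be sketched together as follows. The helper names, the mask, the feature points, and the values of `ew` and `eh` are hypothetical and chosen only for illustration.

```python
import numpy as np

def keep_foreground_points(points, closed_mask):
    """Drop feature points (x, y) that fall on background pixels of the mask."""
    return [(x, y) for (x, y) in points
            if closed_mask[int(round(y)), int(round(x))]]

def inside_ellipse(point, centre, ew, eh):
    """True if the point lies inside a vertical ellipse centred at `centre`.

    ew is the semi-minor (horizontal) axis and eh the semi-major (vertical) axis.
    """
    dx = point[0] - centre[0]
    dy = point[1] - centre[1]
    return (dx / ew) ** 2 + (dy / eh) ** 2 <= 1.0

# Toy data (synthetic): one person-sized foreground region and four feature points.
closed_mask = np.zeros((20, 20), dtype=bool)
closed_mask[5:15, 8:12] = True
points = [(9, 6), (10, 13), (2, 2), (15, 10)]

kept = keep_foreground_points(points, closed_mask)
members = [p for p in kept if inside_ellipse(p, centre=(10, 10), ew=2, eh=5)]
```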


• KLT features are assumed to be uniformly distributed over the entire human body
• EM clustering model is used for clustering
• The number of clusters is set from the number of people estimated in the earlier step.
• It has two steps: the E-step and the M-step.
• Each ellipse containing feature points represents an individual.
• Initialization:
• Suppose the estimated number of people from the neural network-based method
is L. Then, the number of clusters is initialized as 2*L. Each cluster has an equal
prior probability.
• E-step:
• The objective of the E-step is to obtain the assignment probabilities, which
associate the feature points with each cluster. The probability that the feature
point i was generated by cluster j is p(ji) and can be obtained with (10). In (10), h
is the cluster model, and h(ij) is derived from (8) using the parameters of the jth
cluster. pj is the probability of each cluster, and k is the total number of clusters.
• M-step: The objective of the M-step is to maximize the likelihood with respect to the cluster
model parameters. The parameters updated in our algorithm are the location ûj and the
probability pj of each cluster. In (11), n is the total number of feature points, and si is the
location of feature point i.
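
The E-step and M-step described above can be sketched as below. Equations (8), (10) and (11) are not reproduced in these slides, so this is not the paper's exact formulation: the cluster model h is replaced by an axis-aligned Gaussian whose spreads are set to `ew` and `eh`, the updates are the standard mixture-model EM updates for locations and priors, and the way the initial centres are picked is an assumption.

```python
import numpy as np

def em_cluster(points, centres, ew, eh, n_iter=20):
    """EM over cluster locations and priors with a fixed, person-shaped spread."""
    points = np.asarray(points, dtype=float)           # s_i, shape (n, 2)
    centres = np.asarray(centres, dtype=float).copy()  # u_j, shape (k, 2)
    k = len(centres)
    priors = np.full(k, 1.0 / k)                       # equal prior probabilities
    scale = np.array([ew, eh], dtype=float)
    for _ in range(n_iter):
        # E-step: probability that point i was generated by cluster j.
        diff = (points[:, None, :] - centres[None, :, :]) / scale
        h = np.exp(-0.5 * np.sum(diff ** 2, axis=2))   # assumed stand-in for h(ij)
        weighted = priors * h + 1e-12                  # small term avoids division by zero
        resp = weighted / weighted.sum(axis=1, keepdims=True)
        # M-step: update each cluster's location and probability.
        mass = resp.sum(axis=0)
        centres = (resp.T @ points) / mass[:, None]
        priors = mass / len(points)
    return centres, priors, resp

# Toy data (synthetic): feature points on two people centred at x = 10 and x = 40.
group = [(dx, dy) for dx in (-1, 0, 1) for dy in (-3, 0, 3)]
points = [(10 + dx, 20 + dy) for dx, dy in group]
points += [(40 + dx, 20 + dy) for dx, dy in group]

L = 2                                       # estimated number of people
k = 2 * L                                   # number of clusters at initialization
init = points[::len(points) // k][:k]       # assumed initialization: evenly spaced points
centres, priors, resp = em_cluster(points, init, ew=2, eh=5)
```

Because more clusters are started than there are people, several clusters can settle on the same person, which is one reason postprocessing is needed afterwards.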
POST PROCESSING
After the EM clustering step, some postprocessing operations need to be performed
INPUT USED

• The four-hour video was taken at 10 fps, and the image resolution is 640 ∗ 480 pixels. One image of the scene
every 100 seconds was used for the evaluation.
• A total of 153 images were extracted from the original four-hour video.
• In the set of images, the number of people in the scene ranges from 36 to 222.
• The training set consists of 102 images, formed by taking the first two images out of every
three consecutive images.
• The test set is composed of the remaining 51 images.
• To increase the speed of people counting, all the images were resized to 320 ∗ 240 pixels.
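
The train/test split described above can be reproduced with a few lines. The zero-based indexing is an assumption, but the resulting set sizes match the figures given.

```python
n_images = 153
indices = list(range(n_images))

# First two images of every three consecutive images go to training,
# the third goes to testing.
train = [i for i in indices if i % 3 < 2]
test = [i for i in indices if i % 3 == 2]
```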
RESULTS

• The accuracy was seen to increase after the closing operation was applied.
FUTURE IMPROVEMENTS

• Texture inside the foreground region can be used as another input for the neural network.
• A combination of the foreground pixel method and the feature point clustering method can be used
to obtain more feature points on human beings.
• A higher-resolution camera can be used to distinguish human and non-human objects more accurately.
