PROJECT TITLE : CROWD COUNTING SYSTEM USING SWITCH CONVOLUTION NEURAL NETWORK

SYNOPSIS

GROUP MEMBERS        UNIVERSITY SEAT NUMBER
MANISH .K            1JB17CS072
MADHU .H             1JB18CS418
MADHAN REDDY         1JB17CS067
SRINIVAS GOWDA       1JB17CS059

GUIDE NAME PROJECT COORDINATOR NAME

PAVITHRA G S

Introduction :
Crowd counting is the task of estimating the number of people in an image. It is mainly used for
automated public monitoring, such as surveillance and traffic control. Unlike object detection, crowd
counting must recognize targets of arbitrary size across both sparse and cluttered scenes. Accurately
estimating the number of objects in a single image is a challenging yet meaningful task, with
applications in areas such as urban planning and public safety. Among the various object counting tasks,
crowd counting is particularly prominent because of its significance to social security and development.
Techniques developed for crowd counting can also be generalized to related fields such as vehicle
counting and environmental surveys, provided the specific characteristics of those fields are set aside.
Many researchers have therefore turned to crowd counting, and a large body of work has emerged. These
works have advanced the field; the question worth asking, however, is why they are effective for this
task.

Literature Survey :
Given limited time and effort, we cannot analyze every crowd counting algorithm. In this paper, we
have surveyed over 220 works to study crowd counting models comprehensively and systematically,
focusing mainly on CNN-based density map estimation methods. Based on the evaluation metrics, we
select the top three performers on the crowd counting datasets and analyze their merits and drawbacks.
Through this analysis, we expect to make reasonable inferences and predictions about the future
development of crowd counting, and to offer feasible solutions to the problem of object counting in
other fields.
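
The survey above ranks methods "according to the evaluation metrics" without naming them. As a rough illustration, the sketch below computes mean absolute error (MAE) and root mean squared error (RMSE), two metrics commonly reported for counting models. The choice of metrics, the function names, and the toy counts are assumptions made for illustration, not values or definitions taken from the surveyed works.

```python
import math

def mae(predicted, actual):
    """Mean absolute error between predicted and ground-truth counts."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root mean squared error; penalizes large per-image errors more than MAE."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Toy per-image counts, invented purely for illustration.
predicted_counts = [98, 205, 47]
true_counts = [100, 200, 50]

print("MAE :", mae(predicted_counts, true_counts))
print("RMSE:", rmse(predicted_counts, true_counts))
```

A lower value is better for both metrics; RMSE reacts more strongly to a few badly miscounted images.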

Challenges :
Counting by detection can be classified into three types, based on the features used to identify the
crowd in images and videos.
Monolithic detection: A classifier is trained on the full-body appearance available in the training
images, using typical features such as Haar wavelets or gradient-based features such as the histogram
of oriented gradients (HOG). Learning approaches such as SVMs and random forests are applied with a
sliding window. These methods are limited to sparse crowds; for dense crowds, part-based detection is
often more useful.
Part-based detection: Rather than the whole human body, this technique considers a part, such as the
head or shoulders, and applies a classifier to it. The head alone is not sufficient to establish the
presence of a person reliably, so head plus shoulders is the preferred combination.
Shape matching: Ellipses are used to draw boundaries around humans, and a stochastic process then
estimates the number and shape configuration.
The generation of the mask involves a nondifferentiable hard-thresholding operation.
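
To make the HOG feature mentioned under monolithic detection more concrete, the sketch below builds the orientation histogram for a single image cell in plain Python. It is a simplified illustration: it omits the bin interpolation, block normalisation, and SVM classification of a full detector, and the function name and 9-bin setting are assumptions chosen for the example.

```python
import math

def cell_orientation_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations
    (0 to 180 degrees) for one cell, given as a list of rows of grey levels.
    Border pixels are skipped because central differences need both neighbours."""
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]   # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]   # vertical gradient
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The final modulo guards against rounding that lands exactly on 180.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist

# A 6x6 cell with a vertical edge: dark on the left, bright on the right.
vertical_edge = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
print(cell_orientation_histogram(vertical_edge))
```

For the vertical edge, all gradient energy falls in the first bin (orientation near 0 degrees), which is the kind of shape cue a sliding-window classifier would be trained on.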

Motivation :
Traditional handcrafted crowd-counting techniques perform well when the training dataset is
computationally inexpensive to process. However, challenges like occlusion, clutter, and
scale variation reduce the accuracy of such traditional methods. In addition, the estimated
density (ED) map obtained with these handcrafted methods has a low resolution, which limits its
applicability in many areas, such as medical imaging and military applications. In short, the
manual nature of feature extraction makes handcrafted methods poorly adaptive to evolving
crowd-counting demands. Observing these deficiencies in traditional crowd-counting algorithms,
and the success of CNNs in numerous computer-vision applications, researchers were inspired to
exploit CNNs for estimating the nonlinear feature density maps of crowd images. These density
maps can be used in machine-learning processes for more accurate prediction or estimation of the
crowd count. Further techniques include up- and downsampling, scale aggregation, and
pre-classification with a multicolumn approach.
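
As a minimal sketch of how a density map relates to a crowd count, the code below places one normalised Gaussian at each annotated head position and sums the resulting map. The function names, the fixed `sigma`, and the toy head coordinates are assumptions for illustration; the sketch also assumes every head point lies inside the image. CNN-based methods learn to predict such a map from the image rather than building it from annotations.

```python
import math

def density_map(height, width, head_points, sigma=2.0):
    """Build a density map with one Gaussian blob per (x, y) head annotation.
    Each blob is normalised to sum to 1 over the image, so the whole map
    sums to the number of annotated people."""
    density = [[0.0] * width for _ in range(height)]
    for px, py in head_points:
        weights = [
            [math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        total = sum(map(sum, weights))
        for y in range(height):
            for x in range(width):
                density[y][x] += weights[y][x] / total
    return density

def count_from_density(density):
    """The estimated crowd count is the sum over all pixels of the map."""
    return sum(map(sum, density))

# Three hypothetical head annotations in a 20x30 image.
heads = [(5, 5), (15, 10), (29, 19)]
print(round(count_from_density(density_map(20, 30, heads)), 6))
```

Because each blob is normalised separately, the count is preserved even for heads near the image border.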

Problem statement :
A crowd is a gathering of a number of people in one place. It is not feasible to
count or monitor all the people at places such as universities, shopping malls, railway stations,
or airports simply by looking at them. The complexity of monitoring, tracking, and counting
increases with the size of the crowd, and monitoring a crowd for suspicious behavior becomes
equally impractical. The introduction of Closed Circuit Television (CCTV) cameras has solved
this problem to some extent, but tracking or monitoring a large group of people with CCTV alone
is still not possible. In recent years, numerous applications of computer vision have emerged,
and researchers are trying to monitor and count crowds automatically with the help of
machine intelligence. Being able to detect objects in video or camera feeds can be significantly
advantageous. From a technological perspective, computer vision solutions have typically focused
on detecting, tracking, and analyzing people, for example detecting and tracking a person walking
in a university, or identifying communication between two people. Generally, a number of CCTV
cameras are installed at various places to record the environment. However, it is not feasible
for a human operator to sit in front of the CCTV feed all the time to monitor or count the crowd.
There is a need for an automated system that can extract meaningful information from live or
recorded video. Detecting objects in CCTV surveillance video can solve many real-life problems:
if we can count and monitor the crowd, we obtain valuable information about the objects within
the video, and knowing how many people are entering and leaving the premises can yield valuable
insights. One common challenge for any CNN-based crowd counting and monitoring system is meeting
real-time processing requirements when the Deep Learning model must run on embedded devices with
limited processing power and energy. Another challenge in crowd counting is occlusion:
preserving objects across multiple frames when they overlap with each other. The proposed
work uses Deep Learning based methods.

Objectives :
Besides surveillance, crowd scenes also appear in movies, TV shows, personal video
collections, and videos shared through social media. Because crowd scenes contain a large
number of people with frequent and heavy occlusions, many existing detection, tracking, and
activity recognition techniques, which apply only to sparse scenes, do not work well in
crowded scenes. A considerable amount of new research specifically targeting crowd scenes has
therefore been carried out in recent years, covering a broad range of topics including crowd
segmentation and detection.

Methodology :
Video imagery based crowd analysis for population profiling and density estimation in public
spaces can be a highly effective tool for establishing global situational awareness. Different
strategies such as counting by detection and counting by clustering have been proposed, and
more recently counting by regression has also gained considerable interest due to its feasibility in
handling relatively more crowded environments. However, the scenarios studied by existing
regression-based techniques are rather diverse in terms of both evaluation data and experimental
settings. It can be difficult to compare them in order to draw general conclusions on their
effectiveness. In addition, contributions of individual components in the processing pipeline such
as feature extraction and perspective normalisation remain unclear and less well studied. This
study describes and compares the state-of-the-art methods for video imagery based crowd
counting, and provides a systematic evaluation of different methods using the same protocol.
Moreover, we evaluate critically each processing component to identify potential bottlenecks
encountered by existing techniques. Extensive evaluation is conducted on three public scene
datasets, including a new shopping centre environment with labelled ground truth for validation.
Our study reveals new insights into solving the problem of crowd analysis for population
profiling and density estimation, and considers open questions for future studies.
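
The counting-by-regression strategy mentioned above maps a global image feature directly to a count, without detecting individuals. The sketch below fits a one-variable least-squares line from a hypothetical feature (foreground pixel area) to the number of people. The feature choice, function names, and training values are illustrative assumptions; published regression methods typically use richer features and perspective normalisation.

```python
def fit_line(features, counts):
    """Ordinary least squares for count = slope * feature + intercept."""
    n = len(features)
    mean_x = sum(features) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in features)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(features, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_count(feature, slope, intercept):
    """Estimate the crowd count for a new frame from its feature value."""
    return slope * feature + intercept

# Invented training pairs: foreground pixel area per frame and labelled count.
foreground_area = [1000, 2000, 3000, 4000]
people_count = [5, 10, 15, 20]

slope, intercept = fit_line(foreground_area, people_count)
print(predict_count(2500, slope, intercept))
```

The appeal of this strategy in crowded scenes is that it never needs to separate overlapping people; its weakness is that a single global feature ignores perspective and scene layout.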

Possible Outcomes :
Crowd counting and analysis have many real-world applications, such as planning emergency
evacuations in case of fire outbreaks or other calamitous events, and making informed decisions
based on the number of people present, such as water and food planning and congestion detection.
In recent years, research on crowd counting has gained importance because it helps in detecting
people misbehaving in video sequences. This paper presents a study of various crowd counting
techniques. Counting crowds is difficult for reasons such as illumination changes in each
image scene. The results obtained with an implemented HOG/SVM based crowd counting
method are also given in this paper. The Mall dataset is used to evaluate the implemented
method.

References :
[1] Tuzel, Oncel, Fatih Porikli, and Peter Meer. "Pedestrian detection via classification
on Riemannian manifolds." IEEE Transactions on Pattern Analysis and Machine Intelligence 30,
no. 10 (2008): 1713-1727.
[2] Viola, Paul, and Michael J. Jones. "Robust real-time face detection." International Journal of
Computer Vision 57, no. 2 (2004): 137-154.
[3] Dalal, Navneet, and Bill Triggs. "Histograms of oriented gradients for human detection." In
International Conference on Computer Vision & Pattern Recognition (CVPR'05), vol. 1,
pp. 886-893. IEEE Computer Society, 2005.
[4] Wu, Bo, and Ramakant Nevatia. "Detection of multiple, partially occluded humans in a
single image by Bayesian combination of edgelet part detectors." In Tenth IEEE International
Conference on Computer Vision (ICCV'05) Volume 1, vol. 1, pp. 90-97. IEEE, 2005.
[5] Sabzmeydani, Payam, and Greg Mori. "Detecting pedestrians by learning shapelet features."
In 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-8. IEEE, 2007.
[6] Felzenszwalb, Pedro F., Ross B. Girshick, David McAllester, and Deva Ramanan. "Object
detection with discriminatively trained part-based models." IEEE Transactions on Pattern Analysis
and Machine Intelligence 32, no. 9 (2010): 1627-1645.
