
Unusual Crowd Activity Detection using OpenCV and Motion Influence Map

Arun Kumar Jhapate, Department of Computer Science & Engineering, Sagar Institute of Research & Technology, Bhopal, India, arunjhapate11@gmail.com
Sunil Malviya, Department of Computer Science & Engineering, Kamla Nehru Institute of Technology, Sultanpur, India, sunilmalviya0104@gmail.com
Monika Jhapate, Department of Computer Science & Engineering, Sagar Institute of Research & Technology, Bhopal, India, monikajhapate24@gmail.com

Abstract— Suspicious behavior in public areas is dangerous and may cause heavy casualties. Various systems have been developed on the basis of video frame acquisition, in which motion or pedestrian detection takes place, but those systems are not intelligent enough to identify unusual activities even in real time. Stampede situations must be recognized in real time from video surveillance for quick and immediate management, before any casualties occur. The proposed system focuses on recognizing suspicious activities and aims at a technique that is able to detect suspicious activity automatically using computer vision. The system uses the OpenCV library for classifying different kinds of actions in real time. A motion influence map is used to represent the motion of objects that frequently change position from one place to another. The system uses a pixel-level presentation to make the actual situation easy to understand and identify.

Keywords- Unusual Activity Detection, Action Recognition, Motion Influence Map, OpenCV, Crowd based Activity Detection.

I. INTRODUCTION

Recognizing human activities is an important task that is moving forward in the connected era. Sensors and wearable processing are now commonly accessible, a setting also called the Internet of Things (IoT). It is at the center of assistive technologies that report what the activity is when users try to understand their own behavior. Using such data, specialists and countless users can benefit from smarter devices, gain more information on activity classification and better understand the machines around them. There has been substantial research in the field of Human Activity Recognition (HAR), which studies how to distinguish between normal activities in daily life (walking, running, sitting, standing, etc.). Approaches have been suggested for a wide variety of functions and activities with algorithmically more complex structures, analysing behaviour that occurs at irregular intervals and potentially happens while other activities are being performed. The motivation behind action recognition is to recognize the tasks and goals of one or more agents from a series of observations of the agents and of environmental conditions. Since the 1980s, this research area has attracted the attention of many computer science communities, since it provides personal assistance for a wide range of applications and areas such as medicine, human-computer interaction, or sociology. Because of its diverse nature, activity recognition in different domains may be referred to as plan recognition, goal recognition, intent recognition, behavior recognition, location estimation and location-based services [1].

Figure 1. Group Activity Detection [2]

Fig. 1 represents group activities that can be traced either through physical sensor networks or through computer vision. Sensors are not precise enough for actions recognized in a group; computer vision is a more effective approach for this purpose.

A motion influence map is a motion representation technique in which motion energy is first extracted and the motion influence map is then built from those energies. The motion influence map can organize the motion influence features and contrasts for detecting unusual human activity.

Figure 2. Usual Motion Influence Map [3]

As shown in Fig. 2, the motion is normal or usual and there are no frequent changes over the frames in the particular interval of time. It can be observed that there is no influence on the map; if frequent changes are made, the influence is represented on the map as shown in Fig. 3.

Figure 3. Unusual Motion Influence Map [3]

II. RELATED WORKS

A. Literature Survey

Suspicious activities in public areas put personal security in genuine danger. In public areas such as streets, prisons, holy sites, airports and grocery stores, a huge number of video surveillance systems are used. Video surveillance cameras are not smart enough to perceive irregular activities even in real time. It is important to monitor the detection of suspicious activities and to check the validity of the surveillance video. Stampede situations must be recognized in real time from video surveillance for fast and immediate management.

Zakia Hammal et al. [4] proposed a framework based on a Convolutional Neural Network trained for human facial expression recognition. The framework can be trained with various facial expressions and can track activities with respect to the given expressions. The CNN-based AU detection revealed comparable findings concerning infants across tasks. The accuracy rate for recognizing the correct activity or expression ranges between 79 and 93 %.

He Xu et al. [5] proposed a framework which depends on RFID, a physical sensor. The RFID system can be divided into the following three components: reader, tag and back-end computer system, which communicate through the reader and tag antennas. The working steps of the RFID system are as follows: (1) the readers send a radio frequency signal into the surrounding environment and check whether there is any tag; (2) when a tag within the reader antenna's reading range is activated, it uses its own antenna to communicate with the reader and sends its chip electronic code or other information; (3) the RFID reader receives the electronic product code (EPC) or information signal of the tag through the antenna; the information is then decoded and processed, and sent to the back-end computer system.

Varsha Shrirang Nanaware et al. [6] presented a review of different implemented systems for action recognition. Various researchers have worked on detection methodologies of multiple human tracking and action recognition in real-time moving video, and an exhaustive literature survey of the recent works done by various authors is given for this exciting and application-oriented practical research field. This survey/review paper can be the starting point for research work on "detection methodologies of multiple human tracking and action recognition in real-time moving video surveillance".

Jiahao Li et al. [7] proposed a framework which depends on a pyramid energy map as the feature descriptor for a sequence of frames; it can save and present the action history that spatially contrasts with the recognized activities. It depends on a bidirectional neural network which can backtrack the hidden layers and present the most relevant outcomes. It is effective for a single target or skeleton but makes mistakes for multiple targets.

Nour El Din Elmadany et al. [8] proposed a framework which depends on Biset Globality Locality Preserving Canonical Correlation Analysis, which aims to learn the common feature subspace between two sets. The second strategy is Multiset Globality Locality Preserving Canonical Correlation Analysis, which is meant to deal with three or more sets. It arranges skeletons as data sets. The accuracy for the correct recognition rate is 90.1%.

Soumalya Sen et al. [9] proposed a framework which depends on an image parsing technique. Image parsing relates various kinds of activities performed by humans that can be recognized in a sequence of frames. Actions are classified as walking, running, clapping, cycling, surfing, and so forth. It depends on foreground and background correlation, through which the framework enhances the foreground object and stores these frames for future comparison. Image parsing brings together image segmentation, object detection and recognition.

III. PROBLEM IDENTIFICATION

Various researches have been done in the field of human activity detection, but fewer works are found for unusual human activity detection in real time using a camera. Computer vision is a challenging approach that can acquire real-time human activity. Earlier proposed systems are capable of detecting actions, especially in the single-user human activity category, using skeleton recognition. This kind of action can only be recognized against plain backgrounds and is not suitable for non-plain backgrounds or outdoor scenes. Multi-user actions cannot be recognized by such systems; moreover, it is difficult for them to recognize group or crowd-based activities. The proposed system is intended to recognize human activities from crowds and detect unusual activities, in order to give prior notifications that help prevent heavy casualties. Skeleton tracking works only at certain angles or distances and is not useful for far elevations, which results in inaccurate precision.

Suspicious activities in public areas put personal security in genuine peril. In public areas such as streets, prisons, holy sites, airports and supermarkets, a large number of video surveillance systems are used.
Video surveillance cameras are not intelligent enough to recognize unusual activities even in real time. The objective of the system is to recognize unusual human activity in a crowd using a motion influence map and OpenCV, for prior appraisal against crime in public places using a camera. It will help to build or install a system that can work 24x7 for real-time surveillance, decrease crowd-based criminal activities and save us from fatal results.

Figure 4. Skeleton Recognition for Single User Activity Detection [10]

IV. PROPOSED WORK & IMPLEMENTATION

Figure 5. Crowd Stampede Scene [11]
The proposed work is able to recognize human activity in a crowd and analyze whether the action is usual or unusual. The system deals purely with crowd-based activities and the situations they create. The system uses the OpenCV library along with a Python IDE, which gives the best precision. The system proposes a motion influence map that contributes to a correct recognition rate. The proposed framework is centered on the recognition of suspicious motion and is aimed at finding a technique that can identify suspicious actions automatically by using computer vision strategies. The proposed system classifies the differences among frames using a motion influence map that represents frequent changes in the frames over a short interval of time. Recognizing unusual activity in a crowd is a difficult task, especially for sensor networks; computer vision is an effective approach that can acquire real-time human activities and later analyze them for uncommon frames.

To be more precise, consider the flow chart. First of all, a crowd video that contains usual as well as unusual activities is given as input. Once the input is made, frame selection starts, which also validates the total number of frames. If the current frame has reached the last frame then the process ends; otherwise it proceeds further to unusual activity detection. Hj is calculated, that is, how much each block influences the map, and this feature traces the feature vector. The feature which is extracted is the influence density in the motion influence map. The map is influenced according to the unusual density, and an uncommon activity is declared only if that density is greater than the threshold density. If it is greater than the threshold value then the decision is declared as unusual activity; otherwise no unusual activity is detected. Once the unusual activity is confirmed, the system clusters the influence area, represents it over the frames and finally shows it at pixel level, so that the user can easily identify whether an unusual activity has been performed or not. The system is based on the Motion Influence Map and OpenCV libraries. The Motion Influence Map extracts the motion features, which are later clustered through K-means clustering. The system clusters those frames which contain unusual activity as defined in the motion influence map. Motion either influences a block or not; this can only be confirmed by observing the influence direction and distance: if it is far from the current position then the influence has to be detected. Figure 6 shows the flow chart of the system.

Figure 6. Flow Chart

The implementation stage involves the actual realization of the ideas expressed in the analysis document and developed during the design stage. Implementation should be a faithful mapping of the design documentation into a suitable programming language to get the required end product. This section discusses key choices about the selection of the platform, the language used, and so on. These choices are frequently influenced by many factors, for example the real environment in which the system works, speed requirements, security concerns and other implementation-specific requirements; we also briefly examine the significant modules and methods that are present in the project. The code is divided into five modules: optical flow blocks, motion influence generator, create megablock, training and testing. In this section, a method is described for identifying unusual activities in a crowded scene and representing motion characteristics for localization.
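As a rough illustration of the optical flow blocks module and of the frame-selection loop described above, the following Python sketch reads a crowd video, stops once the last frame has been reached, and averages Farneback dense optical flow over fixed-size blocks to obtain one motion vector per block. It is only a minimal sketch under assumptions, not the authors' released code: the input file name, the 20-pixel block size and the Farneback parameters are placeholders.

    import cv2
    import numpy as np

    def block_motion_vectors(prev_gray, gray, block_size=20):
        # Dense optical flow between two consecutive frames (Farneback).
        flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        h, w = gray.shape
        rows, cols = h // block_size, w // block_size
        vectors = np.zeros((rows, cols, 2), dtype=np.float32)
        for r in range(rows):
            for c in range(cols):
                patch = flow[r * block_size:(r + 1) * block_size,
                             c * block_size:(c + 1) * block_size]
                vectors[r, c] = patch.reshape(-1, 2).mean(axis=0)  # mean (dx, dy) per block
        return vectors

    cap = cv2.VideoCapture("crowd.avi")        # placeholder input video
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    all_vectors = []
    while True:
        ok, frame = cap.read()
        if not ok:                             # last frame reached, so the loop ends
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        all_vectors.append(block_motion_vectors(prev_gray, gray))
        prev_gray = gray
    cap.release()

The per-block vectors collected here play the role of the motion vector set B that the motion influence vector algorithm in the next subsection takes as input.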

Figure 7. Usual Activity Detection

A. Motion Influence Vector Algorithm

Require: S ← block size, K ← a set of blocks in a frame, B ← motion vector set, M ← motion influence map, I ← moving object, D(i, j) ← Euclidean distance between object i and block j, Td ← a threshold, and φij ← angle between a vector from object i to object j.
INPUT: B ← Motion Vector Set
OUTPUT: H ← Motion Influence Map

Step 1: Hj (j ∈ K) is set to zero at the beginning of each frame
Step 2: for all i ∈ K do
    Td = ║bi║ × S;
    θmax = ∠bi + Fi/2;
    θmin = ∠bi − Fi/2;
    for all j ∈ K do
        if i ≠ j then
            Calculate the Euclidean distance D(i, j) between bi and bj
            if D(i, j) < Td then
                Calculate the angle φij between bi and bj
                if θmin < φij < θmax then
                    Hj(∠bi) = Hj(∠bi) + exp(−D(i, j))
                end if
            end if
        end if
    end for
end for
Step 3: Hj with respect to ∠bi is reflected in the motion influence map; indicate the motion influence vector Vj
Step 4: End

B is the input motion vector set and H is the output motion influence map which has to be examined. In Step 1, Hj is set to zero at the beginning of each frame; Hj is the motion influence at block j, where j belongs to K, the set of blocks in a frame. In Step 2, a 'for' loop is applied, where i is a block position that belongs to K. The threshold value Td is computed as the magnitude ║bi║ of the motion vector at the origin block multiplied by the block size S. There are two directions from the origin, −Fi/2 and Fi/2, so the angle bounds are calculated according to the direction of the motion vector. The Euclidean distance between both positions, the origin and the motion vector, is then calculated. Once it has been calculated it is validated: if it is less than the threshold value, the angle φij between bi and bj, that is, the angle between the origin and the motion vector, is calculated. Then it is necessary to find out in which direction it goes; if the condition is satisfied, the motion influence weight is finally accumulated for that vector position, which is later localized with a pixel- or frame-level presentation. Beyond this, there are certain steps towards the localization of motion influence.
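The steps above can be rendered in NumPy as the illustrative sketch below. The eight-direction quantization of ∠bi, the fan angle Fi and the weight exp(−D(i, j)) follow the reconstruction given in this section and may differ in detail from the authors' implementation; a hypothetical fan angle of 90 degrees is used.

    import numpy as np

    def motion_influence_map(vectors, block_size=20, fan_angle=np.pi / 2, n_bins=8):
        # vectors: (rows, cols, 2) block motion vectors; returns H of shape (rows, cols, n_bins).
        rows, cols, _ = vectors.shape
        centers = np.array([[(r + 0.5) * block_size, (c + 0.5) * block_size]
                            for r in range(rows) for c in range(cols)]).reshape(rows, cols, 2)
        H = np.zeros((rows, cols, n_bins))                   # Step 1: Hj set to zero
        for r in range(rows):
            for c in range(cols):
                b_i = vectors[r, c]
                mag = np.linalg.norm(b_i)
                if mag == 0:
                    continue
                Td = mag * block_size                        # threshold Td = ||bi|| x S
                ang_i = np.arctan2(b_i[1], b_i[0])           # direction of bi
                bin_i = int(((ang_i + np.pi) / (2 * np.pi)) * n_bins) % n_bins
                for r2 in range(rows):
                    for c2 in range(cols):
                        if (r2, c2) == (r, c):
                            continue
                        diff = centers[r2, c2] - centers[r, c]
                        d = np.linalg.norm(diff)             # Euclidean distance D(i, j)
                        if d < Td:
                            phi = np.arctan2(diff[1], diff[0])
                            delta = np.arctan2(np.sin(phi - ang_i), np.cos(phi - ang_i))
                            if abs(delta) < fan_angle / 2:   # block j lies inside the fan around bi
                                # influence weight; in practice d may be normalized, e.g. by Td
                                H[r2, c2, bin_i] += np.exp(-d)
        return H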
In the motion influence map, a block containing an unusual activity, together with its neighboring blocks, has unique motion influence vectors. Apart from this, since an activity is captured over several consecutive frames, we extract a feature vector from a cuboid defined by the n × n blocks over the latest T frames. To make megablocks, frames are divided into non-overlapping megablocks, each of which is a combination of several motion influence blocks. The motion influence value of a megablock is the sum of the influence values of all the small blocks that make up the large block.

For feature extraction, the latest 't' frames are divided into megablocks, and for each megablock an 8 × t-dimensional feature vector is drawn across the frames. For example, we enclose the megablock over the 'T' latest frames and their feature vectors to create a distinct feature vector for block (1, 1). For each megablock sequence, we cluster using spatio-temporal features and set the cluster centers as codewords. For this reason, for megablock (i, j) we have K codewords {w(i, j)k}, k = 1, ..., K. Here, it should be noted that in the training stage only clips of normal activities are used.

Accordingly, the codewords of the megablocks model the patterns of usual activities that can occur in the particular zone. For minimum-distance-matrix testing, after extracting the spatio-temporal feature vectors for all megablocks, we build the minimum distance matrix E over the megablocks, in which the value of each element is the minimum Euclidean distance between the feature vector and the codewords. The current test frame and the related megablocks are characterized by these codewords. For the frame-level presentation of unusual activities from the minimum distance matrix, the smaller the value of an element, the lower the probability of an unusual action in the related blocks.
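One possible reading of this megablock stage is sketched below: the per-block influence histograms are summed inside each non-overlapping megablock, stacked over the last T frames into an 8 × T-dimensional feature, clustered per megablock position with K-means over normal training clips to obtain K codewords, and compared at test time through a minimum distance matrix. The megablock size, the values of T and K, and the use of scikit-learn are assumptions made for illustration only.

    import numpy as np
    from sklearn.cluster import KMeans

    def megablock_features(H_frames, mega=2):
        # H_frames: list of T per-frame maps of shape (rows, cols, n_bins).
        # Returns (mrows, mcols, T * n_bins): one spatio-temporal feature per megablock.
        T = len(H_frames)
        rows, cols, n_bins = H_frames[0].shape
        mrows, mcols = rows // mega, cols // mega
        feats = np.zeros((mrows, mcols, T * n_bins))
        for t, H in enumerate(H_frames):
            for mr in range(mrows):
                for mc in range(mcols):
                    block = H[mr * mega:(mr + 1) * mega, mc * mega:(mc + 1) * mega]
                    # megablock value = sum of the influence values of its small blocks
                    feats[mr, mc, t * n_bins:(t + 1) * n_bins] = block.sum(axis=(0, 1))
        return feats

    def train_codewords(train_feats, K=5):
        # train_feats: (n_clips, mrows, mcols, dim), taken from normal clips only.
        n, mrows, mcols, dim = train_feats.shape
        words = np.zeros((mrows, mcols, K, dim))
        for mr in range(mrows):
            for mc in range(mcols):
                km = KMeans(n_clusters=K, n_init=10).fit(train_feats[:, mr, mc, :])
                words[mr, mc] = km.cluster_centers_          # codewords {w(i, j)k}
        return words

    def min_distance_matrix(test_feat, words):
        # Smaller values mean the pattern matches a normal codeword, i.e. less likely unusual.
        mrows, mcols, K, _ = words.shape
        E = np.zeros((mrows, mcols))
        for mr in range(mrows):
            for mc in range(mcols):
                E[mr, mc] = np.linalg.norm(words[mr, mc] - test_feat[mr, mc], axis=1).min()
        return E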

Computer vision is advancing rapidly using OpenCV along with Python. The Python IDE is a platform on which several image processing concepts can be computed with a high level of precision. The configuration is quite easy to install in Python, where several packages are available and can be used anywhere. OpenCV is an advanced computer vision library that helps compute effectively in the field of image processing. The motion influence map can be drawn easily in Python, and the pixel-level presentation is often easier. Motion influence requires some Python packages to run effectively with a high level of true recognition accuracy. Python also makes it easy to code and test simultaneously by adopting the test-driven development (TDD) approach. There are huge libraries for Python as well as for OpenCV which allow users to implement a system with less code. Many smartphone applications are developed in Python and interact intelligently with humans.
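For the pixel-level presentation mentioned above, a short OpenCV snippet can threshold the minimum distance matrix of a test frame and draw the flagged megablocks onto the image. The threshold value, the megablock size in pixels and the synthetic inputs are placeholders rather than values taken from the paper.

    import cv2
    import numpy as np

    def draw_unusual_blocks(frame, E, mega_px=40, threshold=2.5):
        # Mark megablocks whose minimum distance exceeds the threshold (likely unusual).
        out = frame.copy()
        for mr, mc in zip(*np.where(E > threshold)):
            x, y = int(mc) * mega_px, int(mr) * mega_px
            cv2.rectangle(out, (x, y), (x + mega_px, y + mega_px), (0, 0, 255), 2)
        return out

    # toy usage with a synthetic frame and a random distance matrix
    frame = np.zeros((240, 320, 3), dtype=np.uint8)
    E = np.random.rand(6, 8) * 3.0
    cv2.imwrite("unusual_blocks.png", draw_unusual_blocks(frame, E))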
Figure 8. Frame Analysis in Python IDE
V. RESULT ANALYSIS

The result has been analyzed on the basis of true positives, true negatives, actual yes, actual no, predicted yes and predicted no. There are 147 frames in total, of which 122 are true positives, meaning that 122 frames with unusual activities were detected correctly, and 20 are true negatives, representing correct rejections. The actual yes count is 132 and the actual no count is 15. As per the prediction, 136 frames are predicted yes and 11 are predicted no. On this basis, the overall accuracy has been calculated as 96.59 %, the precision as 89.70 % and the recall as 92.42 %.
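These percentages follow from the quoted counts with the standard confusion-matrix formulas, as the short check below shows.

    total_frames = 147
    tp, tn = 122, 20
    actual_yes, predicted_yes = 132, 136

    accuracy = (tp + tn) / total_frames    # (122 + 20) / 147 = 0.9659...
    precision = tp / predicted_yes         # 122 / 136 = 0.8970...
    recall = tp / actual_yes               # 122 / 132 = 0.9242...
    print(f"accuracy {accuracy:.2%}, precision {precision:.2%}, recall {recall:.2%}")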
GRAPH I. RESULT ANALYSIS
(Bar chart of true positives, true negatives, actual yes, actual no, predicted yes and predicted no against the total number of frames.)

Graph I shows the frame-level results, that is, whether a frame contains unusual activity or not. The result has been obtained on the basis of true positives, true negatives, actual yes, actual no, predicted yes and predicted no, with reference to the total number of frames. On the basis of these simulations, the precision, recall and overall accuracy have been calculated.

TABLE I. RESULT COMPARISON

    Method                        Accuracy %
    Zakia Hammal                  84
    Jiahao Li                     95.97
    Nour El Din Elmadany          94.14
    Aouaidjia Kamel               94.51
    Soumalya Sen                  88.70

GRAPH II. RESULT COMPARISON
(Bar chart comparing the accuracy of the proposed system with the methods of Zakia Hammal, Jiahao Li, Nour El Din Elmadany, Aouaidjia Kamel, Soumalya Sen and Abouzar Ghasemi.)

VI. CONCLUSION

The systems which have been proposed so far are intended to recognize simple human actions such as walking, running and many more, but they are not suitable for crowded areas. The system which has been proposed here is able to recognize unusual human actions in a crowd, and act accordingly, using the motion influence map and OpenCV. The precision rate is a bit higher than the others, and little research has been done on this concept. The proposed system is able to work for prior appraisal against crime. The accuracy is 96.59 %, which is good enough for recognizing unusual activity against complex backgrounds. The proposed system is capable of efficiently recognizing unusual human activity in a crowd by using OpenCV and the Motion Influence Map, which enhances the accuracy and proficiency of the system to a great extent. Unusual Crowd Activity Detection can be implemented in various public places for prior crime notification, which improves casualty management. However, accuracy is always important and requires further enhancement in order to develop an ideal system that can be implemented practically.

REFERENCES
[1] D. Lee, H. Suk, S. Park and S. Lee, "Motion Influence Map for Unusual Human Activity Detection and Localization in Crowded Scenes," in IEEE Transactions on Circuits and Systems for Video Technology, vol. 25, no. 10, pp. 1612-1623, Oct. 2015.
[2] Donghui Song, "https://www.youtube.com/watch?v=y7Jo_9Z9z0U", published on Jun 13, 2018.
[3] D. Lee, H. Suk, S. Park and S. Lee, "Motion Influence Map for
Unusual Human Activity Detection and Localization in Crowded
Scenes," in IEEE Transactions on Circuits and Systems for Video
Technology, vol. 25, no. 10, pp. 1612-1623, Oct. 2015.
[4] Zakia Hammal, Wen-Sheng Chu, Jeffrey F. Cohn, Carrie
Heike, and Matthew L. Speltz, “Automatic Action Unit
Detection in Infants Using Convolutional Neural Network”, 2017
Seventh International Conference on Affective Computing and
Intelligent Interaction (ACII), IEEE.
[5] He Xu, Chengcheng Yuan, Peng Li, Yizhuo Wang,
“Design and Implementation of Action Recognition System
Based on RFID Sensor”, 2017 13th International Conference on
Natural Computation, Fuzzy Systems and Knowledge Discovery
(ICNC-FSKD 2017), IEEE.
[6] Mrs. Varsha Shrirang Nanaware, Dr. Mohan Harihar Nerkar and
Dr. C.M. Patil, “A Review of the Detection Methodologies of
Multiple Human Tracking & Action Recognition in a Real Time
Video Surveillance”, IEEE International Conference on Power,
Control, Signals and Instrumentation Engineering (ICPCSI-
2017).
[7] Jiahao Li, Hejun Wu and Xinrui Zhou, “PeMapNet: Action
Recognition from Depth Videos Using Pyramid Energy Maps on
Neural Networks”, 2017 International Conference on Tools with
Artificial Intelligence, IEEE.
[8] Nour El Din Elmadany, Yifeng He and Ling Guan, “Information
Fusion for Human Action Recognition via Biset/Multiset
Globality Locality Preserving Canonical Correlation Analysis”,
IEEE Transactions on Image Processing, 2018.
[9] Soumalya Sen, Moloy Dhar and Susrut Banerjee,
“Implementation of Human Action Recognition using Image
Parsing Techniques”, 2018 Emerging Trends in Electronic
Devices and Computational Techniques (EDCT), IEEE.
[10] Frontiersin,“https://www.frontiersin.org/files/Articles/160288/f
robt-02-00028-HTML/image_m/frobt-02-00028-g008.jpg”.
[11] Hqdefault, “https://hvss.eventsmart.com/events/intro-concealed-
carry/hqdefault/”.
[12] D. Aranki, G. Kurillo, P. Yan, D.M. Liebovitz, and R. Bajcsy,
“Continuous, real-time, tele-monitoring of patients with chronic
heart-failure: lessons learned from a pilot study,” in International
Conference on Body Area Networks (Bodynets), 2014.
[13] Abouzar Ghasemi and C.N. Ravi Kumar, “A Novel Algorithm
To Predict And Detect Suspicious Behaviors Of People At
Public Areas For Surveillance Cameras,” Proceedings of the
International Conference on Intelligent Sustainable Systems
(ICISS 2017)
[14] Y. Nam, S. Rho, and J. Park, “G3d: A gaming action dataset
and real time action recognition evaluation framework,” IEEE
Conference on Computer Vision and Pattern Recognition
Workshops (CVPRW), 2012.
[15] Y. Nam, S. Rho, and J. Park, “Intelligent video surveillance
system: 3-tier context-aware surveillance system with
metadata,” Multimedia Tools and Applications, vol. 57, pp.
315–334, 2012.
[16] A. F. Bobick and J. W. Davis, “The recognition of human
movement using temporal templates,” IEEE Transactions on
Pattern Analysis and Machine Intelligence, vol. 23, pp. 257–
267, 2001.
[17] M. Blank, L. Gorelick, E. Shechtman, M. Irani, and R. Basri,
“Actions as space-time shapes,” 2005, pp. 1395–1402.
[18] I. Laptev and T. Lindeberg, “Local descriptors for spatio-
temporal recognition,” in Spatial Coherence for Visual Motion
Analysis, 2006.
[19] X. Yang, C. Zhang, and Y. Tian, “Recognizing actions using
depth motion maps-based histograms of oriented gradients,” in
ACM International Conference on Multimedia (MM), 2012.
[20] J. Wang, Z. Liu, J. Chorowski, Z. Chen, and Y.Wu, “Robust 3d
action recognition with random occupancy patterns,” in
European Conference on Computer Vision(ECCV), 2012.
[21] X. Yang and Y. Tian, “Eigenjoints-based action recognition
using naive-Bayes nearest-neighbor,” in IEEE Computer
Society Conference on Computer Vision and Pattern
Recognition Workshops, 2012.
[22] M. Gowayyed, M. Torki, M. Hussein, and M. El-Saban, “Histogram of oriented displacements (HOD): Describing trajectories of human joints for action recognition,” in International Joint Conference on Artificial Intelligence (IJCAI), 2013.

