
2020 International Conference on Information Science and Communication Technology

Identification of Humans and Various Objects through an Image Processing Based System

Qadir Bux Rind, Department of Computer Science, Shah Abdul Latif University, Khairpur, Pakistan. Qadirrind@yahoo.com
Ghulam Ali Mallah, Department of Computer Science, Shah Abdul Latif University, Khairpur, Pakistan. Ghulam.ali@salu.edu.pk
Imran Memon, Department of Computer Science, Bahria University Karachi Campus, Karachi, Pakistan. Imranmemon.bukc@bahria.edu.pk
Noor Ahmed Shaikh, Department of Computer Science, Shah Abdul Latif University, Khairpur, Pakistan. Noor.shaikh@salu.edu.pk

Abstract— This paper presents an image processing based system that identifies humans and their activities using an image database. The technique uses object monitoring and detection, development IDEs, and adequate APIs to obtain the desired functionality. We monitor human activities for detection through a surveillance system and recognize them through image processing. We propose machine learning algorithms, such as SVM and Naive Bayes, to classify the recognized activities. The technique is implemented in MATLAB.

Keywords— image processing, identification, human activity

I. INTRODUCTION

Identification of human activities means detecting activities captured through CCTV cameras or recorded video: the movement of the body and limbs performed with human intention and expressed through body gestures. Examples of such activities are running, walking, and eating. The activities are recorded in a database, and our system detects which are performed by an individual. We use image processing algorithms that match the actions in the recorded video; once a match occurs, the system identifies the activity of the human in the real-time video. The purpose of our project is to identify human activities, store them as they happen, and then analyze them. Our proposed system detects human activities using a depth video system, and our method determines the size of an object interacting with a human.

Many people are involved in suspicious activities, and as social activity increases, security becomes a concern; many people have therefore installed CCTV cameras for constant monitoring, watching interactions for surveillance purposes. The captured video must be analyzed to detect human activity, so a system is required to digitize the video: it extracts the actual features from the images and depth images of the human in front of the camera, uses specific criteria to identify the object the human interacts with, and classifies the extracted features for identification.

Applications of our proposed method are as follows:
1. Object monitoring for detection
2. Surveillance systems
3. Human-computer interfaces
4. Monitoring suspicious human activity
5. Recognizing an activity
6. Identifying simple actions of multiple persons

Our proposed system uses machine learning algorithms, namely Naive Bayes and SVM, to classify the various activities.

1.1 Problem statement
We use a video camera / CCTV to capture real-time video of human activities. The problem is to identify these activities. The activities arrive as a stream rather than in a binary format, so they must be digitized before the images can be processed.

Our contributions are as follows:
1. Capturing videos of human activities.
2. Recognizing human activities.
3. Studying human activity detection methods.
4. Using a unimodal activity method, which detects human activities from data of a single modality, such as images.

II. RELATED WORK

Human face recognition is one of the most interesting fields in computing, with various applications. Face detection algorithms are widely used in many systems, for example human-computer interaction, face recognition, video retrieval, and image database

978-1-7281-6899-9/20//$31.00 ©2020 IEEE


management and security control. However, it is difficult to build a completely robust face detector due to varying lighting conditions, face sizes, face orientations, backgrounds, and skin colors. One proposed face detection method for color images first detects skin regions over the complete image and then generates face candidates based on a connected-component analysis. Finally, the face candidates are separated into human face and non-face images by an improved form of the template-matching method. The experimental results demonstrate successful face detection over the EE368 training images [1].

Human activity recognition is an important technique and has drawn the attention of many researchers due to its varied applications, such as security systems, medical systems, and entertainment. Activity recognition is an interesting and challenging topic of computer vision research due to its intended use in proactive computing. One developed human activity recognition system, simulated in MATLAB, uses the two-dimensional discrete cosine transform (2D-DCT) for image compression and the self-organizing map (SOM) neural network for recognition. The 2D-DCT extracts image vectors, and these vectors become the input to a neural network classifier, which uses self-organizing map algorithms to recognize basic (trained) activities from the images. SOM has a good feature-extracting property due to its topological ordering. Using an image database of 30 action images, containing six subjects with five images each in different body postures, the reported activity recognition rate of the SOM neural network is 98.16% [2].

Improving the robustness of detection is another broadly examined topic. One straightforward strategy is to combine multiple detectors that have been trained independently for different views or poses; [3] applied multiple deformable part models to capture faces with different views and expressions, and a retrieval-based method combined with discriminative learning has also been proposed. Nevertheless, training and testing such models is ordinarily more time-consuming, and the boost in detection performance is relatively limited. More recently, researchers built a model that performs face detection in parallel with face alignment and achieved high performance in terms of both accuracy and speed. Recent years have seen advances in face detection using deep learning, which regularly outperforms conventional computer vision methods significantly. For example, CNNs have been used to automatically learn and synthesize feature extractors for face detection, and one presented method for recognizing faces in the wild integrates a ConvNet and a 3D mean face model in an end-to-end multi-task discriminative learning framework. Lately, Faster R-CNN, one of the state-of-the-art generic object detectors, has been applied to face detection and achieved promising results. In addition, much work has been done to improve the Faster R-CNN architecture; joint training conducted on a CNN cascade, a region proposal network (RPN), and Faster R-CNN has realized end-to-end optimization.

Human detection from still images is a challenging and critical task for computer-vision-based analysis. By recognizing humans, intelligent vehicles can control themselves or can inform the driver using alerting methods. Human detection is one of the most critical parts of image processing: a computer system is trained with different images, and after comparing the input image with the previously stored database, the machine can identify the human under test. One paper describes an approach to detecting the different shapes of people using image processing. The proposal is based on shape-based detection: the shape of the input image is extracted using an operator, specifically the Canny operator. Different images are used to train the system; then, when a test image is given for detection, it is compared with the database. If a certain threshold value is met, the test image is considered to be a particular human. The average accuracy and precision rate achieved by the system is over 93% [4].

Finally, [5] makes three contributions. 1. The prerequisites and factors affecting the performance and characteristics of a biometric face recognition system are defined. First of all, there is the variability of visual images: the appearance of three-dimensional objects, the number and position of light sources, the color and intensity of illumination, and shadows or reflections from surrounding objects. The solution to the problem of detecting objects in an image lies in the correct choice of the description of the objects that the system is built to detect and recognize. This choice includes the choice between 2D and 3D representations of the scene and object, the choice between describing objects as a whole or as a system of parts, and the choice among the systems of characteristics that describe the specifics of the object. 2. The features of the classes and properties of human face recognition problems are analyzed. For a class of image search tasks in large databases, one solution is to store small sets of predefined key characteristics in the database that characterize the images as fully as possible. When tuning the system, it automatically handles the problems of access control; to diminish the likelihood of incorrect identification, one can consider using several images belonging to one individual (with variations), up to comparing the video sequences of certain specific head movements and facial muscles. Solving the passport control problem requires recognition methods based on the distortion of one image in order to turn it into another, assessing the "efforts" required for this transformation. 3. A generalized algorithm for automatic face detection and recognition is developed. The presented scheme of the generalized algorithm consists of nine simple steps and takes the identification features into consideration using photo and video images. The advantage of the algorithm is its simplicity of implementation; already at the design stage of the identification system, it allows the system's operability to be assessed quickly by analyzing the internal interaction of its components [5].


III. METHODOLOGY

Data Input
The video file is given as an input to the system and is subjected to preprocessing. A video is treated as a stream of images called frames, and the frames are processed sequentially. An RGB frame is first converted to grayscale. A grayscale image consists of only the intensity information of the image rather than the apparent colors: an RGB vector is three-dimensional (it consists of values for the colors red, green, and blue), whereas the grayscale vector is one-dimensional.

Pre-Processing

Optical Flow
After the pre-processing step, for each frame in the video, the optical flow is computed for each pixel with respect to the preceding frame. The optical flow is the pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer and the scene. Optical flow is a vector of the form (r, θ), where r represents the magnitude of each pixel's motion and θ represents the direction in which each pixel has moved relative to the corresponding pixel in the previous frames [6-8].

Optical Flow of Blocks
Rather than processing each pixel of a frame individually, the frame is partitioned into m × n uniform blocks, without any significant loss, where each block can be indexed by {B1, B2, ..., BMN}. For example, a frame of size 240 × 320 is divided into 48 blocks, where each block is of size 20 × 20.

Calculating the Optical Flow of Each Block
After dividing the frames into blocks, the flow of each block is computed as the average of the optical flows of all the pixels constituting it: bi = (1/J) Σj fj, where bi denotes the optical flow of block i, J is the number of pixels in a block, and fj denotes the optical flow of pixel j in the block. The optical flow of a block is a vector (r, θ) that represents how much each block has moved, and in which direction, compared to the corresponding block in the previous frames.

Motion Influence Map
The direction of movement of a pedestrian within a crowd can be influenced by a range of factors, e.g., obstacles along the path, nearby pedestrians, and the movement of vehicles. We identify this interaction characteristic as motion influence. We assume that the blocks under the influence of a moving object are determined by two factors:

1. The movement direction.

2. The movement speed. The faster an object moves, the more adjoining blocks come under the influence of the object, and adjoining blocks receive a higher influence than far-away blocks.

Within the motion influence map, a block in which an unusual activity happens has motion influence vectors distinct from those of its adjoining blocks. In addition, as an activity is captured by several successive frames, we extract a feature vector from a cuboid defined by n × n blocks over the most recent t frames.

3.1 Creating Mega Blocks
After the pre-processing step, frames are partitioned into non-overlapping mega blocks, each of which is a combination of several motion influence blocks. The motion influence value of a mega block is the summation of the motion influence values of all the smaller blocks constituting the larger block.

3.2 Extracting Features
After the most recent t frames are divided into mega blocks, for each mega block an 8 × t-dimensional concatenated feature vector is extracted across all the frames. For example, we take mega block (1,1) of all t frames and concatenate their feature vectors to create a concatenated feature vector for block (1,1).

IV. EVALUATION AND DISCUSSION

4.1 Testing Phase
Now that we have generated the code words for normal activities, the generated model is tested with a test dataset that contains unusual activities.

4.2 Minimum Distance Matrix
In the testing stage, after extracting the spatiotemporal feature vectors for all mega blocks, we construct a minimum distance matrix E over the mega blocks, in which the value of an element is defined by the minimum Euclidean distance between the feature vector of the current test frame and the code words of the matching mega block.

4.3 Frame-Level Detection of Unusual Activities
In the minimum distance matrix, the smaller the value of an element, the less likely an unusual action is to occur within that particular block. Conversely, we can say that unusual activities occur in the t successive frames if a large value exists in the minimum distance matrix. Hence, we take the highest value in the minimum distance matrix as the representative feature value of the frame. If this highest minimum distance is larger than a threshold, we classify the current frame as unusual.

4.4 Pixel-Level Detection of Unusual Activities
Once a frame is detected as unusual, we compare the value of the minimum distance matrix of each mega block with the threshold value. If the value is larger than the threshold, we classify that block as unusual, which yields pixel-level unusual activity detection [9-11].
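The minimum-distance testing logic described above (Sections 4.2-4.4) can be sketched as follows. This is a minimal Python illustration, not the paper's MATLAB implementation; the codebook shape, the mega-block grid size, and the threshold value are placeholder assumptions.

```python
import numpy as np

def min_distance_matrix(test_features, codewords):
    """Build the matrix E: for each mega block, the minimum Euclidean
    distance between the test frame's feature vector and that block's
    normal-activity code words.

    test_features: (rows, cols, d)    one feature vector per mega block
    codewords:     (rows, cols, k, d) k code words per mega block
    """
    rows, cols, _ = test_features.shape
    E = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            dists = np.linalg.norm(codewords[i, j] - test_features[i, j], axis=1)
            E[i, j] = dists.min()
    return E

def detect(test_features, codewords, threshold):
    """Frame-level flag (Section 4.3) plus per-block unusual mask (Section 4.4)."""
    E = min_distance_matrix(test_features, codewords)
    frame_unusual = E.max() > threshold  # highest value is the frame's representative
    block_mask = E > threshold           # localize which mega blocks are unusual
    return frame_unusual, block_mask
```

A test feature vector far from every code word of its mega block raises that block's entry in E above the threshold, flagging first the frame and then the block.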
Feature Extraction
For extracting the image features, the following methods are used:

1. Histogram of Oriented Gradients (HOG)
2. Local Binary Pattern (LBP)
3. Bag-of-features methods

HISTOGRAM OF ORIENTED GRADIENTS (HOG)
HOG is used as a descriptor extractor in image processing for detecting objects in an image. The HOG method counts the occurrences of gradient orientations in localized portions of the image; in this respect it is comparable to the edge orientation histogram. A primary step in calculating this feature detector is the normalization of the color and gamma values [13-22].

LOCAL BINARY PATTERN (LBP)
LBP is used for the classification of objects. LBP compares each center pixel with the pixels in its neighborhood: if a neighboring pixel's value is greater than or equal to the center pixel's value, the corresponding bit is set to one in a binary array; otherwise the bit is set to 0. The resulting values form a binary output image. Once the LBP mask has been calculated for all pixels, the LBP histogram is computed [12].

BAG-OF-FEATURES METHODS
The bag of visual words is a well-known method used for image classification; it organizes images in the form of word arrangements and finally classifies images based on a histogram. The bag-of-visual-words pipeline for processing images consists of the following steps:

1. Extraction of local features from the image

2. Encoding the extracted features to make visual words

3. Performing the spatial binning

4. Image classification

REFERENCES
[1] Bertran, Ana, Huanzhou Yu, and Paolo Sacchetto. "Face Detection Project Report." 2002: 1-20.
[2] "Human Action Recognition Using Image Processing and Artificial Neural Networks." 2013. 80(9): 31-34.
[3] Chaitra B H, Anupama H S, and N K Cauvery. "Human Action Recognition using Image Processing and Artificial Neural Networks." International Journal of Computer Applications 80(9): 31-34, October 2013.
[4] Rahman, Ashikur. "Computer Vision Based Human Detection." HAL Id: hal-01571292, 2017. 1(5): 62-85.
[5] Korpan, Yaroslav. "Analysis of Methods and Technologies of Human Face Recognition." September 2017.
[6] Lishani, Ait O., et al. "Human gait recognition using GEI-based local multi-scale feature descriptors." Multimedia Tools and Applications 78.5 (2019): 5715-5730.
[7] Fabelo, Himar, et al. "Surgical aid visualization system for glioblastoma tumor identification based on deep learning and in-vivo hyperspectral images of human patients." Medical Imaging 2019: Image-Guided Procedures, Robotic Interventions, and Modeling. Vol. 10951. International Society for Optics and Photonics, 2019.
[8] Dankwa-Mullan, Irene, et al. "Transforming diabetes care through artificial intelligence: the future is here." Population Health Management 22.3 (2019): 229-242; Hou, Wei. "Identification of coal and gangue by feed-forward neural network based on data analysis." International Journal of Coal Preparation and Utilization 39.1 (2019): 33-43.
[9] Anderson, Noel W., et al. "Human presence detection on a mobile machine." U.S. Patent Application No. 10/183,667.
[10] Ramos, Beatriz, et al. "A new method to geometrically represent bite marks in human skin for comparison with the suspected dentition." Australian Journal of Forensic Sciences 51.2 (2019): 220-230.
[11] Lakshmanaprabu, S. K., et al. "Optimal deep learning model for classification of lung cancer on CT images." Future Generation Computer Systems 92 (2019): 374-382.
[12] Memon, I., Chen, L., Majid, A., Lv, M., Hussain, I., & Chen, G. (2015). Travel recommendation using geo-tagged photos in social media for tourist. Wireless Personal Communications, 80(4), 1347-1362.
[13] Memon, M. H., Khan, A., Li, J. P., Shaikh, R. A., Memon, I., & Deep, S. (2014, December). Content based image retrieval based on geo-location driven image tagging on the social web. In 2014 11th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP) (pp. 280-283). IEEE.
[14] Arain, Q. A., Memon, H., Memon, I., Memon, M. H., Shaikh, R. A., & Mangi, F. A. (2017). Intelligent travel information platform based on location base services to predict user travel behavior from user-generated GPS traces. International Journal of Computers and Applications, 39(3), 155-168.
[15] Memon, M. H., Shaikh, R. A., Li, J. P., Khan, A., Memon, I., & Deep, S. (2014, December). Unsupervised feature approach for content based image retrieval using principal component analysis. In 2014 11th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP) (pp. 271-275). IEEE.
[16] Shaikh, R. A., Deep, S., Li, J. P., Kumar, K., Khan, A., & Memon, I. (2014, December). Contemporary integration of content based image retrieval. In 2014 11th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP) (pp. 301-304). IEEE.
[17] Memon, M. H., Li, J. P., Memon, I., & Arain, Q. A. (2017). GEO matching regions: multiple regions of interests using content based image retrieval based on relative locations. Multimedia Tools and Applications, 76(14), 15377-15411.
[18] Memon, M. H., Li, J., Memon, I., Arain, Q. A., & Memon, M. H. (2017, December). Region based localized matching image retrieval system using color-size features for image retrieval. In 2017 14th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP) (pp. 211-215). IEEE.
[19] Memon, M. H., Memon, I., Li, J. P., & Arain, Q. A. (2019). IMRBS: image matching for location determination through a region-based similarity technique for CBIR. International Journal of Computers and Applications, 41(6), 449-462.
[20] Shaikh, Riaz Ahmed, Kamelsh Kumar, Rafaqat Hussain Arain, Hidayatullah Shaikh, Imran Memon, and Safdar Ali Shah. "Multiple Trips Pattern Mining." DIFFERENCES 9, no. 5 (2018).
[21] Shaikh, R. A., Memon, I., Hussain, R., Maitlo, A., & Shaikh, H. (2018). A contemporary approach for object recognition based on spatial layout and low level features' integration. Multimedia Tools and Applications, 1-24.
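As a concrete illustration of the LBP and HOG extractors described in the Feature Extraction section, the following Python sketch may be helpful. It is illustrative only (the paper's implementation is in MATLAB); the 3 × 3 LBP neighborhood, the bit ordering, and the 9 orientation bins are assumptions, not the authors' exact settings.

```python
import numpy as np

def lbp(image):
    """Basic 3x3 Local Binary Pattern: each interior pixel is compared with
    its 8 neighbors; a neighbor >= center sets the corresponding bit to 1."""
    img = np.asarray(image, dtype=np.int32)
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    center = img[1:-1, 1:-1]
    # Clockwise neighbor offsets, starting from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        out |= (neigh >= center).astype(np.uint8) << bit
    return out

def lbp_histogram(image):
    """256-bin histogram of LBP codes, used as the texture feature."""
    return np.bincount(lbp(image).ravel(), minlength=256)

def hog_cell_histogram(image, bins=9):
    """HOG-style histogram for one cell: gradient magnitude-weighted votes
    into orientation bins over [0, 180) degrees (unsigned gradients)."""
    img = np.asarray(image, dtype=np.float64)
    gy, gx = np.gradient(img)
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    idx = np.minimum((ang / 180.0 * bins).astype(int), bins - 1)
    hist = np.zeros(bins)
    np.add.at(hist, idx.ravel(), mag.ravel())
    return hist
```

In a full pipeline, the per-cell HOG histograms would additionally be block-normalized (the gamma/color normalization mentioned above), and the LBP histogram would be fed to the classifier.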

