An Efficient System for Crime Prediction using Machine Learning
1.Tejas Chaudhari
2.Aditya Kharde
3.Aakanksha Surshetwar
4.Sacchidanand Shinde
Under guidance of Prof. Dipali P. Baviskar
Department of Computer Engineering
Maharashtra Institute of Technology, Pune

Abstract - Crime is one of the unnecessary evils that lurk in our society. It is usually orchestrated by people with nefarious intentions, is inhumane in nature, and has a lasting negative impact on people. To combat this problem, various law enforcement agencies work round the clock so that residents are safe and secure from these crimes, which also requires significant amounts of money from governments. Therefore, the implementation of an artificial intelligence or machine learning approach can provide significant improvements in efficiency. To this end, various studies have been performed to enable crime prediction, but most of them have fallen short of expectations. Thus, this publication outlines an effective and secure crime prediction system that utilizes K-Means clustering and Linear Regression along with Fuzzy Artificial Neural Networks and a Decision Tree. The experimental results indicate that the proposed system performs as intended.

Keywords - Crime Scenario, Crime Prediction, K-Means, Linear Regression, Fuzzy ANN, Decision Tree.

I. INTRODUCTION

Crime is a major part of every society. Crime has lasting effects on everyone who has come in contact with it to some extent. The price paid for crime and its effects varies widely. Additionally, some effects of crime are short-lived whereas others last a lifetime. Of course, the biggest cost paid for crime is loss of life. Other costs are paid by victims, including medical costs, property losses, and loss of income.

Losses to both victims and non-victims also arise from increased expenses such as medical bills and security expenses, including stronger locks, additional lighting, parking in more expensive secure lots, security alarms for homes and cars, and maintaining guard dogs. Significant money is spent to avoid being victimized. Other kinds of expenses may have to be borne by a victim or a person fearful of crime, such as moving to a new neighborhood, funeral expenses, legal fees, and loss of school days.

Some costs of crime are less tangible (not easily or precisely identified). These costs include pain and suffering and a lower quality of life. There are also traumatic impacts on friends and the disruption of family. The behavior of such individuals may be forever changed and shaped by crime, whether through weighing the risks of going to certain places or even the fear of making new friends.

Crime not only affects economic productivity when victims miss work; communities are also affected through loss of business and retail sales. Even the so-called victimless crimes of vice, drug abuse, and gambling have major social consequences. Drug abuse often affects worker productivity and uses public funds for drug treatment programs and medical attention, and the user often turns to criminal activity to support the expense of a drug habit.

Communities and governments spend public funds on police departments, prisons and jails, courts, and treatment programs, including the salaries of prosecutors, judges, public defenders, social workers, security guards, and probation officers. The time spent by victims, offenders, their families, and juries during court trials also takes away from community productivity. By the start of the 21st century, it was estimated that the annual cost of crime in the U.S. was approaching $1.7 trillion.

Crime is one of the largest and most dominant problems in our society, and its reduction is a very important task. Every day there is an immense increase in the number of crimes committed. This requires keeping track of all crimes and maintaining a database of them that can be used for future reference. The present difficulty lies in maintaining a correct dataset of crimes and analyzing this data to assist in predicting and solving crimes in the future. The objective of this work is to analyze a dataset that contains various crimes and to predict which type of crime may happen in the future depending on various conditions. In this project, we use machine learning and data science techniques for crime prediction on the Chicago crime dataset. The crime data is extracted from the official portal of the Chicago police. It
consists of crime information such as location description, type of crime, date, time, latitude, and longitude. Before the model is trained, the data is preprocessed, after which feature selection and scaling are performed in order to increase accuracy. K-Nearest Neighbor (KNN) classification and various other algorithms are tested for crime prediction, and the one with the highest accuracy is used for training. The dataset is visualized graphically for many cases, for example, at which time of the year crime rates are high or in which month criminal activity is on the rise. The sole purpose of this project is to give an idea of how machine learning can be used by law enforcement agencies to detect, predict, and solve crimes at a much faster rate and thereby reduce the crime rate. It is not restricted to Chicago; it could be used in other states or countries depending on the availability of a dataset.

In this analysis, Python was used to explore the crime data, perform multivariate analysis, and predict classes for the test data, so as to identify the strongest correlation between the features (Date, Pd-District, Address, Day of Week, Description, Resolution, X and Y) and the target (Category of Crime). All nominal values were converted into binary values by turning each attribute value into a separate new attribute that takes the value 0 or 1. Several regression methods were tried on the training data, which was split into two sets, training and validation. Both validation and cross-validation were conducted, and the method with the lowest log loss was applied to predict the results on the test data. Two main algorithms were used in this analysis. The first is K-nearest neighbors (KNN), a supervised learning algorithm for either classification or regression, used here with two different distance-weighting functions. The first weighting is uniform: all points in the neighborhood are weighted equally. The second is inverse-distance weighting: points are weighted by the inverse of their distance, so that nearer neighbors of a query point have a greater influence than neighbors farther away. The second algorithm is Naive Bayes, a set of supervised learning algorithms based on applying Bayes' theorem with the "naive" assumption of independence between every pair of features, used here in three variants. The first is Bernoulli, which implements the naive Bayes training and classification algorithms and is useful if the feature vectors are binary. The second is multinomial, which implements the naive Bayes algorithm for multinomially distributed data and is typically used for discrete counts. The third is Gaussian, where the likelihood of the features is assumed to be Gaussian; instead of discrete counts, the features are continuous. In Python, the Scikit-learn library functions were used to conduct regression and classification.

KNN has some benefits and drawbacks compared to Naïve Bayes. An advantage is that KNN's decision boundary can take any form, whereas Naïve Bayes can only have linear, elliptic, or parabolic decision boundaries. In addition, Naive Bayes does not work well with correlated attributes: if the classification depends not on the marginal distributions but on correlation, then NB is not a good choice. Naive Bayes can also be misled by the absence of an attribute. One disadvantage of KNN is that it does not identify the most important attributes; distance is the sole criterion used. In addition, it is non-parametric and so not as explainable as NB; KNN cannot give any relationship between the distribution of attributes and the classes. KNN does not handle missing data properly, whereas Naïve Bayes simply excludes the attribute with missing data. In KNN, the value of K has to be tuned and the best value has to be chosen. Another disadvantage is that KNN is slower during prediction; with massive amounts of data the difference in speed is critical.

Crime hot-spot location prediction is vital for public safety. The output of the prediction gives helpful information to improve the activities aimed at detecting and preventing safety and security problems. Location prediction is a special case of spatial data mining classification. For example, in the public safety domain, it may be interesting to predict the location(s) of crime hot spots. In this study, we use a support vector machine (SVM)-based approach to predict the location as an alternative to existing modeling approaches. Support vector machines form a new generation of machine learning techniques used to find the optimal separation between classes within datasets. We compare the performance of two types of SVM techniques: two-class SVMs and one-class SVMs. We also compare SVM with a neural network-based approach and a spatial auto-regression-based approach. Experiments on two different spatial datasets demonstrate that the former approach performs slightly better and the latter one gives reasonable results. Moreover, in this study, we offer a general framework to customize the spatial data classification task for other spatial domains that work on datasets like the analyzed crime datasets.

Predictive models have many social uses, for example in preventing suicides and crimes by analyzing past data that is available from various sources. Predictive analysis is usually done to predict the outcome of certain incidents using historical data. Traditionally, analysis is done by the user after he or she specifies what to look for in the datasets. But when there are thousands of data items, it becomes more complicated to look for certain numerical values or texts to predict certain outcomes. When machine learning is paired with these prediction methods, far more becomes possible. Machine learning has various concepts that can be put into effect based on the user's requirement. Deep learning is a concept that is used in many systems, such as Google's search engine and the predictive keyboards that everyone uses in their
phones every day. Deep testing data and the output is compared with the actual data. The output is visualized through a graph. Multiple graphs are drawn for multiple training processes. The accuracy can be compared between the graphs.

This paper dedicates Section 2 to an analysis of past work as a literature survey, and Section 3 describes the proposed model in detail. Section 4 elaborates on the experimental setup and result evaluation, whereas Section 5 concludes this paper along with the future scope.

II. LITERATURE SURVEY

This section presents findings drawn from an analysis of the work of several authors, as follows.

S. Yadav [1] discusses how the crime rate in India is increasing along with the population. The authors predict crime from previous years' records of crimes such as murder, kidnapping and abduction, dacoity, robbery, burglary, rape, and other such crimes. The model uses the Naive Bayes algorithm, a data mining technique that classifies data into predefined classes. Correlation and regression are used to relate two variables with each other: a correlation of 1 indicates a perfect relation, and 0 indicates no relation between the two variables. The proposed model thus helps to predict and reduce crime.

A. Babakura [2] states that it is necessary to analyze crime data using data mining techniques. The authors compare two data mining techniques, Naïve Bayesian and Back Propagation, for crime prediction, with the output provided in three categories: low, medium, and high. They report that the accuracy of Naïve Bayesian is better than that of Back Propagation.

S. Sivaranjani [3] regards crime as one of the important problems to be curbed in India and a curse in developing countries. The data is extracted from the National Crime Records Bureau (NCRB) of India and covers six cities over 14 years with 9 attributes. The system applies different clustering methods, such as K-means and Density-Based Spatial Clustering, to find the best clustering method for crime prediction. The attributes are given as input to the KNN algorithm to analyze the large data and are also fed to k-means clustering. The system reports an improvement in results compared to others.

N. Mahmud [4] notes that crime prediction has been an emerging research topic in recent years. Using crime pattern theory, crime can be predicted from past data. The authors introduce CRIMECAST, a crime prediction and strategy direction service, which attempts to predict probable future crimes by simulating a probabilistic model implementation and an Artificial Neural Network. As it is a new model, they have not compared it with any other model.

K. R. Vineeth [5] introduces the factors responsible for the increasing crime rate in India, such as the growing population and the limited job opportunities for youths, which divert them to commit crime due to stress. Crime analysis is necessary and helps investigation agencies take action to prevent crime. The authors use FP Max, a bottom-up approach that concentrates on frequent crimes and uses a linked list to reduce space complexity, and CIP to classify the data with labels such as high, low, and dangerous using random forest, which yields promising accuracy in crime prediction.

Z. Wawrzyniak [6] explains that the prediction of future crime events depends on observational data and other factors that affect crime. Data can be fetched from police records or from open sources to form structured data that can be used to understand criminal behavior for predicting future crime events. The model uses a deep learning Artificial Neural Network architecture to reach a good level of prediction. For the selection of hidden neurons, a virtual leave-one-out (VLOO) test is used, and for the selection of network inputs, Gram-Schmidt orthogonalization (GS) is used. Short-term crime prediction using long short-term memory (LSTM) recurrent neural networks (RNN) and convolutional neural networks (CNN) is also used.

M. Nakib [7] presents an approach for predicting crime from the crime scene using objects such as blood, a knife, and a gun. A large number of CCTV cameras have been installed to monitor areas, but it is very hard to manage all cameras manually. The threatening item is extracted from the images, which gives a prediction of whether a crime has been committed and where the image was taken. Detection is based on the Rectified Linear Unit (ReLU), convolutional layer, fully connected layer, and dropout function of a CNN. An accuracy of 90.2% is achieved on the tested dataset.

C. Chauhan [8] discusses that the crime rate is increasing day by day and is one of the major topics to be researched using Artificial Intelligence and Machine Learning techniques. Using these techniques and the guidance of crime data analysts, crime can be predicted to help investigation officers. It is not possible to handle the huge amount of data manually, and as criminals are becoming technically advanced, it is necessary to develop advanced technology to help police officers stay ahead of them. For classification of the data, the Naïve Bayes classifier is used.

Z. Beiji [9] proposes a neuro-fuzzy based model for evaluating crime. First, data is collected from violent scene detection (VSD) for extracting the fuzzy rules. Videos can be collected from various web sites or from scenes of Hollywood movies. Second, video analysis is performed; the video analysis system has 3 modeled indicators: 52 action concepts, e.g. punch,
slap, kick, fall, run; 15 scene concepts, e.g. crowd, street, park, residential, bush; and 21 object concepts, e.g. gun, knife, fire, face mask, car. For representation they use Bag of Concepts (BoC) and Co-occurrence of Concepts (CoC). The neuro-fuzzy based model is then implemented for computing crime.

H.ianzhong [10] explains that in crime detection it is also necessary to know the geographical profile of a crime, from where it started to where it ended, as an effective method of investigation. Three factors lead to the choice of serial crime scenes: first, the criminal should know the crime place very well; second, the place should lack acquaintances; and third, the point should be far away from the crime spot. Criminal Geographic Targeting (CGT), developed by Darcy Kim Rossmo, contributed to the improvement and popularization of this investigation method.

A. Ghazvini [11] explains that crime is not commonly predictable by police as it is part of society. At present, police detect crime events based on evidence, experience, and intuition. The authors aim to develop a combined transfer function for better performance of Nonlinear Autoregressive time series prediction with external input, using the Levenberg-Marquardt (LM) and Scaled Conjugate Gradient (SCG) algorithms. Hyperbolic Tangent Sigmoid and Radial Basis Function are used as transfer functions in the LM and SCG algorithms for detecting the next suspect's biography in commercial serial cases.

P. L. Brantingham [12] proposes a novel approach to computational modeling of crime patterns and theories in crime analysis, based on the abstract state machine (ASM) paradigm. The Distributed ASM (DASM) model is based on a multi-agent system that works on set rules and congestion rules. The model is developed in such a way that it is applicable not only to a broad range of crimes but also to other settings such as railways, airports, and malls. ASM serves to model the spatial and temporal aspects of crime in urban areas.

Guo Xi [13] introduces the development of industrialization and urbanization and its impact; the authors use a model that presents the concept of risk entropy based on Shannon information theory. Public safety is important in crime prevention and control. To study the path dependence of the calculation method on the security network, they generated an abstract security network simulation. Risk assessment of crime prevention is done according to the security network. The proposed model can overcome the obstacles present in technology for objective risk assessment.

Azhari [14] discusses the criminal act of motorcycle theft. As the population grows, the number of motorcycles increases, since they are a basic travel need, and along with this there is an increase in motorcycle theft. It is very difficult for the police to control and monitor this on a regular basis, since it requires forecasting the probability of theft at a certain time. The authors use a technique that accounts for external influences by using ARIMAX, which transfers the single input that is in the motorcycle.

III. PROPOSED MODEL OF PREDICTION

Figure 1: System Overview for the Crime Prediction

The proposed model for crime prediction using machine learning is depicted in Figure 1. The steps carried out to realize the proposed model are narrated below.

Step 1: Dataset, Preprocessing and Labeling - To accomplish the task of crime prediction, the proposed system selects a quality dataset from the Kaggle public dataset repository. The dataset is downloaded from the URL: https://www.kaggle.com/cityofLA/los-angeles-crime-arrest-data#arrest-data-from-2010-to-present.csv.

This dataset is uploaded by the City of Los Angeles, which is maintained by an organization that has a vast data platform for the respective area. The dataset contains around 23 attributes related to crime and is stored in a spreadsheet. It is fed to the proposed model along with the user inputs to predict the crime.

The user enters attributes through an interactive user interface developed in the Swing framework of Java to predict the possibility of crime. The user enters attributes such as age, arrest type (abbreviated as F, I, M and O; the abbreviations indicate the kind of offence for which the person was arrested), area name, sex, and charge description.

The entered user input data are stored as static variables, and the dataset in the spreadsheet is read into a two-dimensional list. From this list, the attributes 'sex' and 'charge Description' are evaluated for their unique values, which are replaced by integer index labels; the result is called the Labeled list.

Step 2: K-Means Clustering - The data in the labeled list needs to be grouped by nearest properties by clustering it with the K-Means model. The K-Means clustering is performed in the following steps.
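The labeling of Step 1 and the clustering that Step 2 introduces (row distance, centroid selection and boundary-based grouping) might be sketched as follows. This is a minimal Python sketch, not the authors' Java implementation. The function names, the sample rows, the seeded random choice of centroids, the reading of EDD as the average row distance, and the rule that a row joins the first boundary it falls into are all assumptions made for illustration.

```python
import random
from math import sqrt


def label_encode(values):
    """Replace each distinct value with an integer index (Step 1 labeling)."""
    index = {}
    return [index.setdefault(v, len(index)) for v in values]


def row_distances(rows):
    """Mean Euclidean distance from every row to all other rows (row distance RD)."""
    distances = []
    for i, a in enumerate(rows):
        total = 0.0
        for j, b in enumerate(rows):
            if i != j:
                total += sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
        distances.append(total / (len(rows) - 1))
    return distances


def cluster_rows(rows, k, seed=0):
    """Group row indices into k clusters using row-distance boundaries."""
    rd = row_distances(rows)
    edd = sum(rd) / len(rd)  # assumption: EDD is the average row distance
    order = sorted(range(len(rows)), key=lambda i: rd[i])  # ascending by RD
    rng = random.Random(seed)  # assumption: seeded so the sketch is repeatable
    centroids = [rd[i] for i in rng.sample(order, k)]
    clusters = [[] for _ in range(k)]
    for i in order:
        for c, centre in enumerate(centroids):
            # boundary of a cluster = centroid distance minus/plus EDD
            if centre - edd <= rd[i] <= centre + edd:
                clusters[c].append(i)  # assumption: first matching boundary wins
                break
    return rd, edd, clusters


# Hypothetical rows: (age, sex label, charge label)
sex = label_encode(["M", "F", "M", "F"])
charge = label_encode(["theft", "theft", "assault", "fraud"])
rows = list(zip([21, 23, 58, 60], sex, charge))
rd, edd, clusters = cluster_rows(rows, k=2)
```

Rows whose distance falls outside every boundary stay unassigned in this sketch; the paper does not say how such rows are treated.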

Distance Evaluation - For clustering, 3 main attributes are considered from the labeled list: age, sex and charge description, which are labeled in integer format. For each row of the labeled list, the Euclidean distance to all other rows is evaluated and averaged to obtain the row distance. The obtained row distance RD is appended at the end of each row. The average of all these row distances is considered the average row distance, or Euclidean distance of the complete dataset, EDD. This process is carried out using equations 1 and 2.

$$RD = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \tag{1}$$

$$EDD = \sum_{k=0}^{n} RD_k \tag{2}$$

Where,
RD = Euclidean distance of a specific row (RD_k for row k)
x_1, x_2, y_1 and y_2 are the attribute values
EDD = Euclidean distance of the complete dataset
n = Number of rows

Centroid Evaluation - The dataset with the appended row distances is now sorted in ascending order using the bubble sort algorithm. K random row indices are selected from the sorted list and referred to as the data points. The row distance of each data point is extracted and stored in an array called the centroid list.

Boundary Evaluation and Cluster Formation - Each centroid in the centroid list is used to form the boundary of a cluster. The boundary is evaluated by adding the Euclidean distance of the complete dataset, EDD, to the centroid distance and subtracting it from the centroid distance. Each boundary attracts the rows whose row distances fall within it, forming K clusters. These clusters bring the nearest data together to predict the crime efficiently.

Step 3: Linear Regression - The formed clusters are used to evaluate the y-intercept for the linear regression using equation 3. The attributes of each cluster, age and arrest type, are inserted into two integer arrays x[ ] and y[ ] respectively. Using these two arrays in equation 3, the values of the gradient 'M' and intercept 'B' are evaluated.

$$Y = MX + B \tag{3}$$

Where
M = Gradient
X = Factor of age
B = Intercept
Y = Y-intercept

The evaluated values of M and B for a cluster are used to measure the Y-intercept value for the factor of age of each row of the cluster. The obtained Y-intercept regression values are measured for their maximum and minimum, which form the high and low regression ranges. These regression values are utilized in the next step, the fuzzy ANN model, for the prediction of crime.

Step 4: Fuzzy Artificial Neural Network - This is the most important step of the proposed model, where the obtained maximum and minimum regression values are used to form the fuzzy crisp ranges. The difference between the minimum and maximum regression values is divided by five to get the quotient. This quotient is used to separate the five fuzzy crisp ranges: VERY LOW, LOW, MEDIUM, HIGH and VERY HIGH. For each of the regression clusters, the rows in the HIGH and VERY HIGH ranges are added to a list called the ANN input list. This input list is used to estimate the prediction score.

The ANN input list is used to estimate the prediction scores for the major attributes such as age and crime type. This is done by the hidden layer and sigmoid functions given in equations 4 and 5.

$$X = (AG \cdot W_1) + (AT \cdot W_2) + B_1 \tag{4}$$

$$HLV = \frac{1}{1 + \exp(-X)} \tag{5}$$

Where
AG = Age, AT = Arrest type
W_1, W_2, B_1 = Random weights
HLV = Hidden layer value

The obtained hidden layer values are aggregated with the target values of the ANN to obtain the prediction score for each row of the ANN input list, forming the prediction list.

Step 5: Decision Tree for Crime Prediction - The prediction list from the above step is used to decide the crime proneness for the data entered by the user. In this step the prediction list is sorted in descending order using the bubble sort technique to obtain the best crime proneness possibilities. A count is then estimated by matching the entered attributes (area name, age, sex, charge description and arrest type) against the input dataset rows selected by the ANN prediction score. The measured count is scaled to a percentage of the total attribute size. The resulting value is rated for crime proneness on 5 ranges: VERY LOW, LOW, MEDIUM, HIGH and VERY HIGH. The result is then displayed to the user in an interactive user interface. This process is depicted in Algorithm 1.

___________________________________________
ALGORITHM 1: Decision Tree for Crime Prediction
___________________________________________
// Input: Prediction List PDL
// Input: UAN = Area Name, UAG = Age, USX = Sex
// Input: UCD = Charge Description, UAT = Arrest Type
// Output: Prediction String PSTR
Start
PSTR = "", count = 0
for i = 0 to (Size of PDL - 1)
    RL = PDL[i]    [RL = current row of the prediction list]
    AN = RL[4], AG = RL[6], SX = RL[7]
    CD = RL[10], AT = RL[11]
    if (UAN == AN), then count++, end if
    if (UAG == AG), then count++, end if
    if (USX == SX), then count++, end if
    if (UCD == CD), then count++, end if
    if (UAT == AT), then count++, end if
end for
RF = (100 * count) / (Size of PDL * 5)
if (RF <= 20), then PSTR = VERY LOW
else if (RF <= 40), then PSTR = LOW
else if (RF <= 60), then PSTR = MEDIUM
else if (RF <= 80), then PSTR = HIGH
else PSTR = VERY HIGH
end if
return PSTR
Stop

IV. RESULTS AND DISCUSSIONS

The proposed methodology for realizing an efficient and accurate crime prediction system has been implemented in the NetBeans IDE and coded in the Java programming language. The system has been realized on a laptop with a standard configuration comprising an Intel Core i5 processor, 500 GB of storage and 4 GB of physical memory. The MySQL database server is used to fulfill the database responsibilities.

An in-depth evaluation of the performance of the presented system was carried out. To measure the accuracy of the proposed system, the precision and recall metrics were used, which evaluate the performance of the presented methodology in detail. The metrics indicate that the approach for crime prediction realized through the Fuzzy Artificial Neural Network and the Decision Tree framework in this paper performs as expected.

Performance Evaluation based on Precision and Recall

Precision and recall give detailed information about the performance of the presented system. Precision measures the relative accuracy of the presented technique, that is, how many of the predictions made are correct.

Precision in this system is calculated as the ratio of the number of correctly predicted crimes to the total number of correctly and incorrectly predicted crimes.

Recall measures the absolute accuracy of the approach and is distinct from precision. Recall is calculated as the ratio of the number of accurately predicted crimes to the total of accurately predicted crimes and crimes not predicted. Precision and recall are detailed mathematically in the equations given below.

A = The number of accurately predicted crimes
B = The number of inaccurately predicted crimes
C = The number of crimes not predicted

So, precision and recall can be defined as

Precision = (A / (A + B)) * 100

Recall = (A / (A + C)) * 100

Extensive experimentation has been carried out on the presented methodology using the equations detailed above. The outcomes of the experimentation are listed in Table 1.

Table 1: Precision and Recall Measurement Table
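The precision and recall definitions used in this evaluation can be checked with a few lines of Python. This is a minimal sketch; the function name and the counts are hypothetical and are not taken from Table 1.

```python
def precision_recall(correct, incorrect, missed):
    """Return (precision, recall) as percentages.

    correct = A, incorrect = B, missed = C in the definitions of this section.
    """
    predicted = correct + incorrect
    actual = correct + missed
    precision = 100 * correct / predicted if predicted else 0.0
    recall = 100 * correct / actual if actual else 0.0
    return precision, recall


# Hypothetical counts, chosen only to illustrate the arithmetic
p, r = precision_recall(correct=90, incorrect=10, missed=30)
print(f"precision={p:.2f}% recall={r:.2f}%")
```

With these counts, 90 of 100 predictions are correct and 90 of 120 crimes are predicted, so the two metrics differ even though A is the same in both.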
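Returning to Steps 3 to 5 of the proposed model, the straight-line fit, the five fuzzy ranges, the sigmoid hidden layer value and the final proneness rating can be illustrated numerically. The sketch below is in Python rather than the authors' Java and rests on assumptions: the paper does not state how M and B are computed, so ordinary least squares is assumed, and all function names and sample values are hypothetical.

```python
from math import exp

LABELS = ["VERY LOW", "LOW", "MEDIUM", "HIGH", "VERY HIGH"]


def fit_line(xs, ys):
    """Gradient M and intercept B of Y = M*X + B (assumed: least squares)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x


def fuzzy_label(value, low, high):
    """Place value in one of five equal-width ranges between low and high."""
    step = (high - low) / 5  # the quotient of Step 4
    if step == 0:
        return LABELS[0]
    idx = int((value - low) / step)
    return LABELS[max(0, min(idx, 4))]


def hidden_layer_value(age, arrest_type, w1, w2, b1):
    """Weighted sum followed by the sigmoid function."""
    x = age * w1 + arrest_type * w2 + b1
    return 1 / (1 + exp(-x))


def proneness(count, n_rows):
    """Rate matched-attribute count as a percentage of n_rows * 5 attributes."""
    rf = 100 * count / (n_rows * 5)
    for upper, label in zip((20, 40, 60, 80), LABELS):
        if rf <= upper:
            return label
    return LABELS[4]


# Hypothetical cluster: ages as x, arrest-type labels as y
m, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```

Only rows rated HIGH or VERY HIGH by `fuzzy_label` would go on to the hidden layer step under the scheme described in Step 4.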

Detection, Analysis & Prediction ” International Conference


on Electronics, Communication, and Aerospace Technology
ICECA 2017.

[2] Abba Babakura, Md Nasir Sulaiman and Mahmud A.


Yusuf,” Method of Classification Algorithms for Crime
Prediction” International Symposium on Biometrics and
Security Technologies (ISBAST) 2014.

[3] S.Sivaranjani, Dr.S.Sivakumari, Aasha.M,” Crime


Prediction and Forecasting in Tamilnadu using Clustering
Approaches” International Conference on Emerging
Technological Trends [ICETT] 2016.

[4] Nafiz Mahmud, Khalid Ibn Zinnah, Yeasin Ar Rahman,


Figure 2: Comparison of Precision and Recall

Figure 2 above presents the experimental outcomes graphically. The precision and recall assessment shows that the presented system predicts crime with high accuracy. Utilizing the Fuzzy Artificial Neural Networks and the Decision Tree framework, the presented technique achieved a Precision of 91.41% and a Recall of 91.91% in its first attempt, significantly surpassing traditional techniques for Crime prediction.

V. CONCLUSION AND FUTURE SCOPE

The proposed methodology for crime prediction has been outlined in this paper. The methodology is executed on a criminal activity data set that is first pre-processed. The pre-processed data is then labeled for easier clustering. In the subsequent step, the labeled data is subjected to K-Means clustering to form relevant clusters for prediction. Linear regression is then applied to the clusters to weed out irrelevant data. The regression list obtained from this step is used for neuron creation in the Fuzzy Artificial Neural Networks. After this processing, the decision tree is applied to the resultant data to achieve accurate crime prediction. The experimental evaluation yielded a Precision of 91.41% and a Recall of 91.91% in the first attempt, which is a strong result.
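
The K-Means clustering step summarized above can be illustrated with a minimal Java sketch. This is not the authors' implementation: it assumes two numeric features per record, uses the first k points as initial centroids, and runs on made-up sample points. The class name `KMeansSketch` and the method `cluster` are hypothetical.

```java
import java.util.Arrays;

// Minimal K-Means sketch over 2-D points (e.g. two hypothetical numeric crime features).
// Assumes k <= points.length and that every point has exactly two coordinates.
final class KMeansSketch {

    // Returns the cluster index assigned to each point.
    static int[] cluster(double[][] points, int k, int maxIterations) {
        // Illustrative initialization: the first k points serve as initial centroids.
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }
        int[] labels = new int[points.length];
        Arrays.fill(labels, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            boolean changed = false;
            // Assignment step: attach each point to its nearest centroid.
            for (int i = 0; i < points.length; i++) {
                int best = 0;
                double bestDist = Double.MAX_VALUE;
                for (int c = 0; c < k; c++) {
                    double dx = points[i][0] - centroids[c][0];
                    double dy = points[i][1] - centroids[c][1];
                    double dist = dx * dx + dy * dy;
                    if (dist < bestDist) {
                        bestDist = dist;
                        best = c;
                    }
                }
                if (labels[i] != best) {
                    labels[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // assignments are stable, so the clustering has converged
            }
            // Update step: move each centroid to the mean of its assigned points.
            double[][] sums = new double[k][2];
            int[] counts = new int[k];
            for (int i = 0; i < points.length; i++) {
                sums[labels[i]][0] += points[i][0];
                sums[labels[i]][1] += points[i][1];
                counts[labels[i]]++;
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] > 0) {
                    centroids[c][0] = sums[c][0] / counts[c];
                    centroids[c][1] = sums[c][1] / counts[c];
                }
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        // Made-up sample points forming two well-separated groups.
        double[][] points = {{1, 1}, {9, 9}, {1, 2}, {2, 1}, {8, 9}, {9, 8}};
        int[] labels = cluster(points, 2, 10);
        System.out.println(Arrays.toString(labels)); // prints [0, 1, 0, 0, 1, 1]
    }
}
```

In a pipeline like the one described, the resulting cluster labels would be passed on to the regression step; how that hand-off is done in the proposed system is not specified here, so it is left out of the sketch.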

For future research, the proposed methodology can be executed on real-time criminal activity data from police records. The accuracy of the methodology can be increased further by introducing more attributes for prediction.

