
2017 6th IIAI International Congress on Advanced Applied Informatics

Using Machine Learning to Assist Crime Prevention


Ying-Lung Lin
Department of Information Management
Yuan Ze University
Taoyuan City, Taiwan
fxm900206216@gmail.com

Tenge-Yang Chen
Intelligence Integrated Center
New Taipei City Police Department
New Taipei City, Taiwan
sha3098@gmail.com

Liang-Chih Yu
Department of Information Management
Yuan Ze University
Taoyuan City, Taiwan
lcyu@saturn.yzu.edu.tw

Abstract—Drug-related criminal activity is gradually rising in Taiwan and has a significant negative social impact. This paper proposes a data-driven method based on “broken windows” theory and spatial analysis to analyze crime data using machine learning algorithms and thus predict emerging crime hotspots for additional police attention. The Deep Learning algorithm has been widely applied in several fields, including image recognition and natural language processing. With fine tuning, we find that the Deep Learning algorithm provides better prediction results for potential crime hotspots than other methods, including Random Forest and Naïve Bayes. Furthermore, we improve model performance by accumulating data over different time scales. To validate the experimental results, we visualize potential crime hotspots on a map and observe whether the models can identify true hotspots. Finally, we discuss the applicability of this method and present future research directions.

Keywords—Crime prevention; Machine learning; Spatial analysis

978-0-7695-6178-3/17 $31.00 © 2017 IEEE
978-1-5386-0621-6/17
DOI 10.1109/IIAI-AAI.2017.46

I. INTRODUCTION
Drug-related crime is gradually increasing in Taiwan, and the police require more powerful tools to combat and prevent such crimes. Traditional approaches are highly dependent on the experience of senior police officers, which makes it difficult to generalize methods and share information. Junior police officers lack sufficient experience to identify potential crime hotspots and thus lose opportunities to prevent crime.

We present a data-driven method based on “broken windows” theory and spatial analysis. Broken windows theory posits that failure to respond to low-level criminal activity in an area will lead to more serious crimes. Based on this theory, we design a model that predicts the incidence of drug-related crime in the following month from the incidence of drug-related crime, fraud, assault, intimidation, auto theft, and burglary in the current month. We then extend the model with spatial-temporal characteristics. Each grid split from the map is regarded as a sample, and samples in the same time scale are accumulated to construct matrices. Each matrix represents a different spatial-temporal status, so the matrices can be used to train and test machine learning algorithms.

II. METHOD
As shown by the workflow in Fig. 1, we split the map into grids at different scales: 500 m × 500 m (10,843 cells), 750 m × 750 m (4,877 cells) and 1000 m × 1000 m (2,817 cells). We then combined 56 features from eight different types of crime and seven different spatial-temporal patterns: current month, following month, last year, the eight surrounding grids in the current month, the eight surrounding grids last year, tendency, and proportion.

After calculating the value of each feature in the grids, we set drug crime in the following month as the dependent variable. We want to predict crime hotspots for the following month, so the dependent variable is not in the same temporal environment as the features. Empty grids are removed to prevent model performance degradation. For the different time scales, we design seven sets of accumulated data covering 1, 3, 6, 9, 12, 15, and 18 months.

We next prepared the data frames to train the models. Experiments were run using different algorithms (Deep Learning, Random Forest, and Naïve Bayes) to compare prediction results against the proposed method. Naïve Bayes is a simple algorithm based on Bayes’ theorem. Random Forest is an ensemble algorithm based on decision trees. Deep Learning is a more recently developed algorithm that provides outstanding performance in several fields, including image recognition and natural language processing. Deep Learning operates using Artificial Neural Networks and thus benefits from improved computing capability to perform more complex calculations and improve model tuning.

Fig. 1. Spatial-temporal analysis workflow.

Step 1: Split the map $M$ into grids $g_n$, $M = \{g_1, g_2, g_3, \ldots, g_n\}$.
Step 2: Each grid $g$ has the features $f_n$, $g = \{f_1, f_2, f_3, \ldots, f_n\}$; calculate the value of each feature in the grid.
Step 3: Drug-related crimes that will happen in the following month are denoted as $Y$. If no drug-related crime occurs in the grid, the grid is a coldspot, and $Y = 0$. Otherwise, the grid is a hotspot, and $Y = 1$.
Step 4: If the sum of the grid is zero, it is an empty grid and should be removed.

Finally, we evaluated model performance in terms of accuracy, precision, recall and f-measure. We seek results with both high accuracy and a high f-measure. Accuracy refers to how correctly the model predicts hotspots and coldspots
overall, and f-measure represents the model’s ability to correctly find hotspots.

Fig. 2. Configuration of Deep Learning algorithm.

Layers                             Configuration
Input layer: 56                    Activation function: Rectifier
Hidden layer: 40,40,30,30,20,20    Learning rate: 0.001
Output layer: 2                    Epochs: 30

III. EXPERIMENTAL RESULTS

A. Data and Analysis Tools
We use R and the h2o package in this experiment, and use actual crime data to validate results. Model prediction results are then visualized with QGIS. Unfortunately, we are unable to release the data due to its sensitive nature, but experimental results are available on GitHub (https://goo.gl/qVoHDo).

B. Performance of Different Grid Areas
As shown in Fig. 3, larger grids provide better performance. The number of grids decreases as grid area increases, thus raising the probability of a grid being a hotspot. Although model performance increases with grid area, larger areas make it difficult for police to precisely locate the hotspot, thus reducing practical utility.
The model tuned with the Deep Learning algorithm (configuration shown in Fig. 2) provides the best performance: its accuracy, precision, recall and f-measure decrease less than those of the other algorithms. In contrast, for the standard method, which treats last month’s hotspots as next month’s hotspots, recall decreases sharply.

Fig. 3. Average performance of different grid areas.

C. Performance of Accumulating Time Scales
As shown in Fig. 4, accumulated data allows the models to learn from past periods, thus slightly improving the performance of all models. Deep Learning provides the best performance when accumulating 15 months’ worth of data, producing a clear improvement in accuracy and f-measure.

Fig. 4. Average performance of accumulating time scales.

D. Visualization
As shown in Fig. 5, predicted and actual hotspots were plotted on a map. We found that the model accurately predicts most hotspots, thus validating its utility.

Fig. 5. Visualization maps of drug-related crime hotspots: (1) actual hotspots, (2) predicted hotspots, (3) actual hotspots, (4) predicted hotspots.

IV. CONCLUSION
We demonstrate a machine learning method designed to provide improved prediction of future crime hotspots, with results validated by actual crime data. The model tuned using Deep Learning provides the best performance. Visualizations of predicted hotspots can assist patrol planning and improve crime prevention.
Effectively combating and preventing drug-related crime requires continuous improvements to data-driven methods. Future efforts will seek to integrate additional data sources reflecting economic and traffic conditions to better simulate actual conditions. Additional sources of unstructured data, such as narrative reports, could also potentially be integrated to further improve crime prevention efforts.

REFERENCES
[1] Wilson, J. Q., & Kelling, G. L. “Broken windows.” Critical issues in policing: Contemporary readings, pp. 395-407, 1982.
[2] Chen, H., Chung, W., Xu, J. J., Wang, G., Qin, Y., & Chau, M. “Crime data mining: a general framework and some examples.” Computer, 37(4), pp. 50-56, 2004.
[3] Yu, C. H., Ward, M. W., Morabito, M., & Ding, W. “Crime forecasting using data mining techniques.” In Data Mining Workshops (ICDMW), 2011 IEEE 11th International Conference, pp. 779-786, December 2011.
[4] Yu, C. H., Ding, W., Chen, P., & Morabito, M. “Crime forecasting using spatio-temporal pattern with ensemble learning.” In Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer International Publishing, pp. 174-185, May 2014.
[5] Wang, D., Ding, W., Lo, H., Stepinski, T., Salazar, J., & Morabito, M. “Crime hotspot mapping using the crime related factors—a spatial data mining approach.” Applied Intelligence, 39(4), pp. 772-781, 2013.
[6] Murray, A. T., McGuffog, I., Western, J. S., & Mullins, P. “Exploratory spatial data analysis techniques for examining urban crime implications for evaluating treatment.” British Journal of Criminology, 41(2), pp. 309-329, 2001.
[7] Candel, A., Parmar, V., LeDell, E., & Arora, A. “Deep Learning with H2O.” H2O.ai Inc., 2016.
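
As an illustration of Steps 3 and 4 of the workflow in Fig. 1 and of the evaluation measures named in Section II, the following is a minimal Python sketch. It is not the R/h2o code used in the experiments. The names `Cell`, `build_samples` and `evaluate` are hypothetical, the toy data are invented for the example, and two readings are assumed: "the sum of the grid" is taken to be the sum of a grid's feature values, and the standard method is taken to predict a hotspot wherever the current month already has drug-related crime.

```python
from typing import Dict, List, Sequence, Tuple

Cell = Tuple[int, int]  # hypothetical (row, column) index of a grid


def build_samples(
    features: Dict[Cell, Sequence[float]],
    next_month_drug_crimes: Dict[Cell, int],
) -> List[Tuple[Cell, List[float], int]]:
    """Label each grid (Step 3) and drop empty grids (Step 4)."""
    samples = []
    for cell, values in features.items():
        # Step 4 (assumed reading): a grid whose feature values sum to zero is empty.
        if sum(values) == 0:
            continue
        # Step 3: Y = 1 (hotspot) if any drug-related crime occurs next month, else Y = 0.
        y = 1 if next_month_drug_crimes.get(cell, 0) > 0 else 0
        samples.append((cell, list(values), y))
    return samples


def evaluate(actual: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and f-measure, with hotspot (1) as the positive class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Toy data (invented): three features per grid; the first is assumed to be
    # the current-month count of drug-related crime.
    features = {
        (0, 0): [1, 0, 2],
        (0, 1): [0, 0, 0],  # empty grid, removed in Step 4
        (1, 0): [0, 3, 0],
        (1, 1): [2, 1, 0],
    }
    next_month = {(0, 0): 2, (0, 1): 1, (1, 0): 1}
    samples = build_samples(features, next_month)
    actual = [y for _, _, y in samples]
    # Assumed form of the standard method: a current hotspot stays a hotspot.
    baseline = [1 if values[0] > 0 else 0 for _, values, _ in samples]
    print(evaluate(actual, baseline))
```

Treating the hotspot as the positive class matches the description of f-measure as the model’s ability to find hotspots. In this sketch a measure whose denominator is zero is reported as 0.0, which is a convention chosen here rather than one stated in the paper.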
