(IJACSA) International Journal of Advanced Computer Science and Applications,

Vol. 1, No. 6, December 2010

An Attribute Oriented Stimulate Algorithm for

Detecting and Mapping Crime Hot Spots
M. VijayKumar
Assistant Professor, Department of Computer Applications
KSR College of Engineering
Tiruchengode-637 215

Dr. C. Chandrasekar
Reader, Department of Computer Science
Periyar University
Salem

Abstract- Crime mapping is a very effective method for detecting high-crime-density areas known as hot spots. A crime hot spot is an area where the number of criminal or disorder events is larger than in any other place, or an area where people have a higher risk of victimization. Many theories and methods are in common use. They explain different types of crime phenomena that occur at different geographic levels. The most widely used method for detecting crime hot spots is spatial clustering of the original crime data. The application of spatial optimized-division to distinguish crime hot spots has a long tradition in crime mapping. These methods do not represent the actual spatial distribution of crime and often mislead researchers into focusing on areas of low crime importance within an optimized-division. They also depend on strict preconditions, require complex computing, and need a number of parameters to be entered. It is therefore necessary to preprocess the spatial information of the original data. The original data describe crime events by attributes such as event time, event class, event spatial information and event object. These data have many attributes at different levels, so the attribute oriented stimulate method is chosen to deal with them. Experiments with the traditional attribute oriented stimulate method show that some crime event attributes are overly generalized. This paper presents an improved attribute oriented stimulate method and algorithm for detecting crime hot spots, and describes a simple mapping method. Experiments show that the resulting map is clearer and more specific than those produced by the traditional methods.

Keywords- Crime mapping; Hot spot; Spatial clustering; Attribute oriented stimulate

I. INTRODUCTION
Crime mapping and analysis have developed significantly over the past 30 years. Early on, many officers and agencies visualized individual crime events and crime distribution by using all kinds of pins on city and area maps. With the fast advancement of GIS techniques, visualizing and detecting the order of criminal activities have become more necessary and effective. GIS techniques provide many methods for combining spatial information with other data. Furthermore, GIS is one of the most influential tools for integrating a wide variety of spatial information and facilitating exploration of crime distribution.

Recently, crime analysis has tended to focus on crime hot spot detection. A crime hot spot is an area where there are more criminal or disorder events than in any other place, or an area where people have a higher risk of victimization [1]. Hot spot analysis can help police identify high-crime areas and the types of crime being committed, and suggest the best way to respond. Crime hot spot analysis is usually performed with crime pin maps of reported crime events over a certain period. With the development of computer-based GIS, this spatial analysis has become more and more practical and efficient.

Many theories and methods for exploring hot spots have been established and applied. Some theories, known as place-based theories, help explain point concentrations of crime. Other theories help explain linear concentrations of crime (street theories) or hot spot crime polygons (neighborhood theories and large area theories). There are many methods for detecting hot spots on maps. The best known is the preliminary global statistical method, which includes mean center, standard deviation distance, standard deviation ellipse, clustering, spatial autocorrelation, etc. These theories and methods indicate repeat-place hot spots by point mapping with graduated symbols, color gradient dots and repeat addresses; detect crime time hot spots and repeat streets by simple dot mapping; and identify neighborhood and other area hot spots by dots, lines, ellipses, choropleth or isoline mapping.

Some existing theories and methods are usually applied to map the spatial information of the original data directly. This causes some limitations. One is that using the original data lowers efficiency; the other is that the choice of an appropriate theory and method is subject to strict conditions. This research therefore proposes a solution. First, we preprocess the spatial information of the original data (crime events), and then use an improved attribute oriented stimulate method for clustering event attributes. Finally, we map the extracted points to show specific hot spots.

II. CRIME HOT SPOT MAPPING

A. Hot Spot Theories
Depending on the perspective, current theories take a variety of forms. The most important and common theories are place-based theories. Place-based theories fall squarely within the theoretical tradition of social
environmental science, but are more specific about the mechanisms by which structural context is translated into individual action. The dominant theoretical perspectives are derived from routine activities theory [2] and rational choice theory [3]. Place-based theories explain why crime events occur at specific locations. They deal with crimes that occur at the lowest level of analysis, that is, specific places. Crime phenomena at this level occur as points, so the appropriate units of analysis can be addresses, street corners, and other very small places, which are typically represented on maps as dots. Much research on crime hot spot analysis is based on these theories. There are other theories, such as street, neighborhood and area theories, which deal with crime data at various other levels. Because they are not central to this research, they are not discussed in detail.

B. Types of Crime Hot Spots
Corresponding to the theories above, there are several types of hot spots, such as repeat-place hot spots, repeat-street hot spots, and area hot spots. Among them, high-crime place hot spots (repeat-place hot spots) are used most frequently. These hot spots are places with a high incidence of crime. They can be addresses, street corners, stores, houses, street segments or any other small locations. When looking for these hot places, dot maps are superior to other forms of mapping. Although hot places are often concentrated within areas, they are often separated by other places with few or no crimes. These hot spots are best depicted by dots, so the methods and techniques of crime analysis and mapping are based on points. There are many other kinds of hot spots that are not shown as points. They usually indicate the 'hot' thing that happens frequently or deserves close attention.

C. Techniques and Methods
1) Detecting crime hot spot
Many methods and techniques exist to understand and describe patterns of hot spots in crime data. All of them are based on statistical theory. A typical one is the preliminary global statistics technique, which includes the mean center method, the standard deviation distance method, the standard deviation ellipse and the clustering method. The clustering method is well known because of its contribution to data mining, and it is probably the most useful method of preliminary global statistics. Crime analysts often assume that crime distribution is clustered rather than completely spatially random. Clustering is therefore probably the most widely used method to date. The attribute oriented stimulate method in this paper is also based on clustering, as are the nearest neighbor index and the Z-score method [4].
Spatial autocorrelation is another useful method; it includes the Moran's I method and the Geary's C statistic [5] [6]. These test whether the distributions of point events are related to each other. Positive spatial autocorrelation is said to exist where events are clustered, or where events that are close together have more similar values than those farther apart.

2) Mapping crime hot spot
The most common approach to displaying geographic patterns of crime is point mapping [7]. In a specific application, if individual geographic point objects are suitably attributed with information such as a code describing the type, date, and time of the offense, sets of points meeting particular conditions can be selected simply and quickly. These selections can then be displayed with a suitable symbol representing the crime category. Point maps in common use include the point density map, the spatial ellipse map and so on. All these mapping methods rely on point or place data.

Another technique for representing the spatial distribution of crime is geographic boundary thematic mapping. The geographic boundaries are usually defined as administrative or political areas, e.g. police authority, block, district, city region, and province region. Crime events mapped as points can be aggregated into these geographic areas. Useful geographic boundary maps include quadrat thematic maps, interpolation maps and quartic kernel density estimation maps made by continuous surface smoothing methods. These mapping methods are based on area or region data.

D. Restrictions
Currently the most useful method for detecting crime hot spots is spatial clustering. The application of spatial ellipses to distinguish crime hot spots has a long tradition in crime mapping. Nearest neighbor hierarchical clustering and K-means clustering are typical methods among spatial ellipse techniques. The ellipses represent hot spots. However, crime hot spots do not naturally form spatial ellipses, so these methods do not represent the actual spatial distribution of crime and often mislead researchers into focusing on areas of low crime importance within an ellipse. These methods also depend on strict requirements, need complex computing, and require a number of parameters to be entered. Inappropriate parameters lead to indistinguishable final output.

Because the data in these methods are not preprocessed, the next section introduces attribute oriented stimulate clustering to deal with the original data and overcome the limitations above.

III. ATTRIBUTE ORIENTED STIMULATE METHOD
The attribute oriented stimulate method is a relatively simple method of conceptual clustering. Like general clustering, it aims to detect similar objects in data sets. Its main idea consists of three steps. The first is querying and collecting data relevant to the task, the second is inspecting each attribute of the task-related data, and the third is repeatedly replacing attribute values with more abstract values. The more abstract values are defined in advance in a generalization hierarchy graph. Before explaining the method, it is necessary to analyze the data (crime events) in detail.

A. Crime Events
In this research the events or cases include criminal cases, administrative cases and public security cases. All kinds of
events are cases that record details over the period from the commission of the crime to the response. A case is sometimes described like 'somebody was robbed of a cell phone worth Rupees Ten Thousand in the Home Inn at Gandhi Road of Peking at a certain time between 7:00 PM and 8:00 PM on Wednesday, June 1st 2010'. Descriptions of crime events vary between countries. However, the basic records include when the event happened, what kind of event it is, who the victim is, how much money was lost, where the event took place, what the precise description of the case is, and so on. They look a little complex, but they reduce to a few fundamental attribute types: numerical attributes, time attributes, category attributes, and string attributes.
• Examples of numerical attributes include counters and the money involved in cases.
• Examples of time attributes include the time the crime happened, the time it was disposed of, and the time it was solved.
• Examples of category attributes include violations of property, robbery, hijack, offenders, organizations, etc.
• String attributes take arbitrary and unforeseeable text values, such as case descriptions and case detection processes.

Figure 2. Classification for crime types
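As a minimal sketch of these four attribute types, a crime event record could be modeled as below. The field names, the example values and the day_type helper are illustrative assumptions, not the schema used in the experiments; the helper only hints at how a time attribute might later be mapped onto a classification.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CrimeEvent:
    """One crime event; all field names are hypothetical."""
    money_involved: float   # numerical attribute
    occurred_at: datetime   # time attribute
    crime_class: str        # category attribute
    place: str              # category attribute
    description: str = ""   # string attribute (free text, not clustered)


def day_type(event: CrimeEvent) -> str:
    """Assumed first generalization step for time: weekday -> WORKDAY or WEEKEND."""
    return "WEEKEND" if event.occurred_at.weekday() >= 5 else "WORKDAY"


# Example record loosely based on the case description in the text.
event = CrimeEvent(10000.0, datetime(2010, 6, 1, 19, 30), "ROBBERY", "HOTEL",
                   "cell phone robbed")
print(event.crime_class, event.place, day_type(event))
```

Keeping the free-text description separate from the categorical fields reflects the fact that only attributes with a classification can be generalized.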
In order to define the similarities between crime events, a classification of the attributes can be used. A classification is a generalization ladder over the values of an attribute, and each of the attributes above can be described by its own classification. For example, a classification might state that a railway station is a transport site, which in turn is a place with a large transient population. Figs. 1, 2 and 3 show example classifications for crime places, crime types and crime time respectively.

Figure 3. Classification for crime time

B. Classical Attribute Oriented Stimulate Algorithm
Before introducing the attribute oriented stimulate algorithm, some concepts and principles must be explained briefly.
Let S denote a set of crime events, and let |S| be the number of crime events. Let Ai be an attribute of the events and Ci be the classification of attribute Ai. For two elements x1, x2 in the tree of Ci, if there is a path from x1 to x2, then x1 is called a parent of x2, and x1 is a generalization of x2. The attribute oriented stimulate clustering method uses the concept of dissipation instead of similarity: the smaller the dissipation between events, the more similar they are. The crime event clustering problem is then to find a set G that minimizes the dissipation of G, subject to the constraint |G| ≥ min_size, where min_size is a user-defined threshold. By reduction from the CLIQUE problem, event clustering is NP-complete [8].

Figure 1. Classification for crime places
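To make these definitions concrete, the following sketch stores each classification as a child-to-parent map, tests the generalization relation, and greedily generalizes events until one generalized event covers at least min_size original events. The trees, attribute values and the round-robin choice of attribute are assumptions made for illustration; the greedy loop does not minimize dissipation and is not the algorithm of this paper, only a small instance of generalization along classification trees in the spirit of [9].

```python
from collections import Counter

# Hypothetical classification trees, stored as child -> parent.
PLACE = {"RAILWAY STATION": "TRANSPORT SITE", "BUS STOP": "TRANSPORT SITE",
         "TRANSPORT SITE": "ANY-PLACE", "BAR": "RECREATION SITE",
         "RECREATION SITE": "ANY-PLACE"}
CRIME = {"PICKPOCKET": "THEFT", "BURGLARY": "THEFT", "THEFT": "ANY-CASE",
         "FRAUD": "ANY-CASE"}
TREES = (PLACE, CRIME)  # one tree per attribute position


def is_generalization(tree, x1, x2):
    """True if x1 lies on the path from x2 up to the root (and x1 != x2)."""
    node = tree.get(x2)
    while node is not None:
        if node == x1:
            return True
        node = tree.get(node)
    return False


def generalize(event, attr):
    """Replace attribute `attr` by its parent; root values stay unchanged."""
    values = list(event)
    values[attr] = TREES[attr].get(values[attr], values[attr])
    return tuple(values)


def find_cluster(events, min_size):
    """Greedy sketch: return (generalized_event, size) or None if none is found."""
    counts = Counter(events)
    if not counts:
        return None
    attr = 0
    while True:
        best, size = counts.most_common(1)[0]
        if size >= min_size:
            return best, size
        # Try the attributes in round-robin order until one can be generalized.
        for step in range(len(TREES)):
            a = (attr + step) % len(TREES)
            merged = Counter()
            for ev, n in counts.items():
                merged[generalize(ev, a)] += n  # identical events are merged
            if merged != counts:
                counts, attr = merged, a + 1
                break
        else:
            return None  # every value is at its root and no cluster is big enough


events = [("RAILWAY STATION", "PICKPOCKET"), ("BUS STOP", "PICKPOCKET"),
          ("RAILWAY STATION", "BURGLARY"), ("BAR", "FRAUD")]
print(find_cluster(events, min_size=3))
```

With the sample data, two generalization steps (first places, then crime classes) are needed before three events coincide.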
Let DB = (S1, S2, …, Sm) be the set of crime events, where each event Si = {A1, A2, …, An} and the attributes of the event that are not used have been deleted. Cn is the classification of the remaining attributes. A classical algorithm is as follows [9]:
Classical algorithm:
Begin
  For (each Si)
    delete each attribute Ai that cannot be generalized by Cn
  While (events in DB can be generalized) do
    choose an attribute Ai
    For (each Si)
      Si.Ai = parent(Si.Ai)   // generalize an attribute
    For (each pair S1, S2)
      If (S1 = S2)
        delete S2   // merge the same events
  Endwhile
End

To test the practicality of the algorithm above, the crime events of a certain provincial capital of China in 2009 were chosen. The number of these events is about one hundred thousand. They are saved in the ASJ_T table. The classifications of places, time, case categories, sex and money involved are applied in the test. The results of the test are not satisfactory, because some of the detected hot spots are so abstract that they lose the information that is really necessary. The crime hot spots mostly have formats like {…, SMALL, WORKDAY}. Such hot spots are useless to police and other agencies for managing and controlling the social security situation. The reason these formats emerge is that every attribute is overly generalized: the classical algorithm uses the classifications to generalize each attribute as far up the hierarchy as possible. The algorithm must therefore be improved by some adjustments.

C. Enhanced Attribute Oriented Stimulate Algorithm
A heuristic method is used to solve the preceding problem. First, a count value is defined for each event, with an initial value of 1. The value is the number of events in the cluster (hot spot). If the number of identical events is greater than the threshold value, these events are not generalized further and are saved directly as hot spots. Secondly, if the number of events is greater than the threshold value after a single generalization of an attribute by its classification, the attribute is generalized up the hierarchy one level at a time until the number of events on the attribute is smaller than the threshold value; the attribute with the greatest number of identical values is then chosen. Finally, to avoid excessive generalization, the generalization of all attributes must be canceled once a hot spot is produced. The improved algorithm is given below:

Improved algorithm:
Begin
  DB' = DB   // make a copy of the data
  For (each Si' in DB')
    Si'.Count = 1   // initialize the count
  For (each Si')
    delete each attribute Ai that cannot be generalized by Cn
  While (events in DB' can be generalized) do
    choose an attribute Ai by the heuristic method
    For (each Si')
      Si'.Ai = parent(Si'.Ai)   // generalize an attribute
    For (each pair S1, S2)
      If (S1 = S2)
        S1.Count = S1.Count + S2.Count
        delete S2   // merge the same events
    If (a proper event exists)
      produce a hot spot
      delete the event from DB'
      For (each Si left in DB')
        Si = DB(Si)   // cancel the generalization
  Endwhile
End

D. Results of Test
Using the enhanced algorithm, some events (hot spots) are produced. They are:
• Crime time hot spots: in the results of the test, there are many time-related events such as {RECREATION SITE, ANY-CASE, WEEKEND, NIGHT, SUMMER}, {TRANSPORT SITE, THEFT, WORKDAY, HOLIDAYS}, and {ANY-PLACES, HIJACK, NIGHT, SUMMER}. They indicate that in summer, on weekend nights, personal injury cases happen frequently. So the police should enhance patrols in the midnight hours.
• Crime place hot spots: in the results of the test, a lot of events concentrate on {RAILWAY STATION, FRAUD, ANY-DAY-OF-WEEK, ANY- … The relationships between transport sites and theft cases, service sites and theft cases, and railway stations and fraud cases make clear that these sites are place hot spots. So the police should increase deployment in these places.
• Crime class hot spots: some events depict the 'hot' crime classes in the results. The top three classes are theft, fraud and robbery. These hot classes are only noticed at the global level and usually at the time of an anti-crime campaign.
• Crime victim hot spots: if the taxonomies of the sex and age attributes are applied in the algorithm, the 'hot' victims can be explored at the same time. The results
… SUMMER, FEMALE, YOUNG PEOPLE}, and {BUSINESS, FRAUD, WORKDAY, ANY-DAY-OF-YEAR, ANY-GENDER, OLD PEOPLE}. This means that the victims of fraud are usually old people, and that young women are vulnerable to robbery, assault and so on.

Identifying hotspots is the first step a policing or crime reduction agency needs to take when deciding where best to prioritize its resources. Attempting to do this via point mapping has become outdated with the proliferation of GIS software and the increasing sophistication of mapping techniques. In this section, we describe four of the most common hotspot mapping techniques.

A. Spatial ellipses
One of the earliest crime mapping software applications that became widely available to practitioners for crime analysis was Spatial and Temporal Analysis of Crime (STAC). STAC is a spatial tool for finding and examining hotspot areas within the study area. In short, STAC first finds the densest concentrations of points on the map (hot clusters), and then fits a "standard deviational ellipse" to each one. The ellipses themselves indicate, through their size and alignment, the nature of the underlying crime clusters.

Figure 4. Standard deviational spatial ellipses

B. Thematic mapping of geographic boundary areas
A widely used way of representing spatial distributions of crime events is geographic boundary thematic mapping. The boundary areas used for this type of thematic mapping are usually arbitrarily defined for administrative purposes; for example, they can be police beats, census blocks, wards or districts. Offences plotted as points on a map can be aggregated to these geographic unit areas, which can then be shaded in accordance with the number of crimes that fall within them.

Figure 5. Thematic mapping of administrative units

C. Grid thematic mapping
This approach does have some limitations; the use of grids still restricts how the hotspots can be displayed. Spatial detail within and across each quadrat is correspondingly lost because the crime events have to conform to one specific quadrat, which can then lead to inaccurate interpretation by the …

Figure 6. Grid thematic mapping

D. Kernel density estimation
Kernel density estimation (KDE) is regarded as the most suitable spatial analysis technique for visualizing crime data. It is an increasingly popular method because of its growing availability (for example the MapInfo add-on Hotspot Detective), the perceived accuracy of hotspot identification, and the visual appeal of the resulting map in comparison to other techniques. Point data are aggregated within a user-specified search radius, and a continuous surface that represents the density or volume of crime events across the desired area is calculated. A smooth surface map is produced, showing the variation of the point/crime density across the study area, with no need to conform to geometric shapes such as ellipses.

Figure 7. KDE

V. HOT SPOTS MAPS
After hot spot detection has been completed, the hot spots can be mapped. Because the focus of this research lies in the processing of the data set, the mapping method in this research is simpler than the traditional methods of mapping hot spots. The main idea of the traditional methods is to analyze the spatial relationship between every pair of events directly from the original data, which needs complex computing and is relatively inefficient.
In this paper, a simple method for showing the detected hot spots on maps is described. For crime time, crime class, crime victim and other hot spots with no direct spatial information, pie charts, histograms and other thematic pictures can be used. These pictures are superimposed on
the area of patrol and naturally illustrate the most important and urgent information to the police.

For crime place hot spots, the clustered events have been saved in a temp table of the database. Each event has its spatial information, such as latitude and longitude, and each hot spot has its event dataset. To mark a certain crime place hot spot on the map, the center of all the events in the cluster is found first. Then the mean distance of these points from the center is taken as the radius of a circle. The circle denotes a crime place hot spot, which includes various kinds of crime information that can be explained in the legend of the map. Fig. 8 shows a crime place hot spot map. The map is based on the distribution of all the crime dots, and the ellipse clusters are shown together with the hot spots.

Figure 8. Crime place hot spots and ellipse clusters

VI. CONCLUSION
This paper presented an approach for detecting crime hot spots by the attribute oriented stimulate method. For this purpose, several kinds of crime cases were analyzed for clustering. This method differs from traditional spatial clustering: it focuses on preprocessing the crime events before mapping. The advantage of this method is that the precise information of hot spots is more apparent. However, much research on this method remains to be done. The classification of attributes is the key to the accuracy of the results, so the design and optimization of classifications should be studied further. The mapping method in this paper is relatively simple; improving it on the basis of KDE is left for future work.

REFERENCES
[1] A. R. Gonzales, R. B. Schofield, and S. V. Hart, "Mapping Crime: Understanding Hot Spots," National Institute of Justice Report, Washington, August 1999, pp. 2.
[2] L. E. Cohen and M. Felson, "Social change and crime rate trends: A routine activity approach," American Sociological Review, Nashville, vol. 44, 1979, pp. 588-605.
[3] D. B. Cornish and R. V. Clarke, "The reasoning criminal: Rational choice perspectives on offending," Springer-Verlag, New York, 1986.
[4] E. G. Knox, "The detection of space-time interactions," Applied Statistics, vol. 13, pp. 25-29, 1964.
[5] L. Anselin and A. Getis, "Spatial statistical analysis and geographic information systems," Annals of Regional Science, Springer, vol. 26, pp. 19-33, 1992.
[6] N. Levine, "CrimeStat 2.0, A Spatial Statistics Program for the Analysis of Crime Incident Locations," National Institute of Justice, Washington, pp. 12-50, 2002.
[7] E. Jefferis, "A Multi-Method Exploration of Crime Hot Spots: A Summary of Findings," National Institute of Justice, Washington, pp. 8, 1999.
[8] C. H. Papadimitriou, "Computational Complexity," Addison-Wesley, 1994.
[9] J. Han, Y. Cai, and N. Cercone, "Knowledge Discovery in Databases: An Attribute-Oriented Approach," 18th VLDB Conference, Vancouver, Canada, pp. 547-559, 1992.
[10] A. H. Pilevar and M. Sukumar, "GCHL: A grid-clustering algorithm for high-dimensional very large spatial data bases," Pattern Recognition Letters, 2005, pp. 999-1010.
[11] L. Kaufman and P. J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis, New York, 1990.
[12] R. Ng and J. Han, "Efficient and effective clustering method for spatial data mining," Very Large Data Bases, Santiago, 1994, pp. 144-155.
[13] S. Guha, R. Rastogi, and K. Shim, "An efficient clustering algorithm for large databases," Management of Data, SIGMOD'98, Seattle, 1998, pp:
[14] T. Zhang, R. Ramakrishnan, and M. Livny, "An efficient data clustering method for large databases," Management of Data, Seattle, 1996, pp:
[15] M. Ester, H. P. Kriegel, and X. Xu, "A density-based algorithm for discovering clusters in large spatial databases," Knowledge Discovery and Data Mining, Portland, 1996, pp. 226-231.
[16] M. Ankerst, M. Breunig, H. P. Kriegel, and J. Sander, "Ordering points to identify the clustering structure," Management of Data, Philadelphia, 1999, pp. 49-60.
[17] A. Hinneburg and D. A. Keim, "An efficient approach to clustering in large multimedia databases with noise," Knowledge Discovery and Data Mining, New York, 1998, pp. 58-65.
[18] W. Wang, J. Yang, and R. Muntz, "A statistical information grid approach to spatial data mining," Very Large Data Bases, Greece, 1997, pp. 186-195.
[19] Xiang Zhang, Zhiang Hu, Rong Li, and Zheng Zheng, "Detecting and mapping crime hot spots based on improved attribute oriented induce clustering," Geoinformatics, 2010 18th International Conference, Beijing, pp. 1-5.
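As a supplementary illustration of the marking step described in Section V (center of the clustered events, then mean distance as the radius), a minimal sketch is given here. It assumes that the center is the arithmetic mean of the coordinates and that the points are already in a planar projected coordinate system, so that Euclidean distance is meaningful; raw latitude and longitude would first need to be projected. The function name and sample points are hypothetical.

```python
import math


def hot_spot_circle(points):
    """Return (center_x, center_y, radius) for a cluster of planar (x, y) points.

    The center is the mean of the coordinates (an assumption) and the radius
    is the mean Euclidean distance of the points from that center.
    """
    if not points:
        raise ValueError("a hot spot needs at least one event")
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    radius = sum(math.hypot(x - cx, y - cy) for x, y in points) / n
    return cx, cy, radius


# Four events on the corners of a 2 x 2 square (projected units, e.g. km).
cluster = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
print(hot_spot_circle(cluster))
```

Because the radius is a mean rather than a maximum, some events of a cluster will normally fall outside the drawn circle.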
