1 Krishnendu S.G, 2 Lakshmi P.P, 3 Nitha L
1 PG Student, 2 PG Student, 3 Assistant Professor
Department of Computer Science and IT,
Amrita School of Arts and Sciences, Kochi
Amrita Vishwa Vidhyapeetham, India
1 krishnendusg1@gmail.com, 2 pplakshmi14@gmail.com, 3 nitha.leeladharan@gmail.com
Abstract-In India, the crime rate is increasing each day. Recent technological influence, the effects of social media, and modern approaches help offenders commit their crimes. Crime analysis and prediction is a systematic method that classifies and examines crime patterns. Various clustering algorithms exist for crime analysis and pattern prediction, but they do not meet all the requirements. Among these, the K-means algorithm provides a better way of predicting the results. The proposed research work mainly focuses on predicting the regions with higher crime rates and the age groups with more or less criminal tendencies. We propose an optimized K-means algorithm to lower the time complexity and improve the efficiency of the result.

Keywords- crime, clustering, optimized k-means algorithm.

... best algorithms that resolve a familiar clustering problem. An optimized K-means method for data clustering reduces changes to the cluster values after a certain number of repetitions. The number of clusters depends on the k value, and finding the optimum k value leads to picking better centroid values.

Crimes in India are currently increasing at a rapid rate. Our analysis considers a large dataset, and manually selecting the k value for such a large dataset is difficult and time-consuming. To overcome this problem, we use the elbow method to find the k value. To implement optimized K-means clustering in Python, we use the Spyder software.
II. LITERATURE REVIEW
Phase-I: Initial centroids are calculated in this phase. This is done by using the elbow method to find the optimum k value.

Algorithm 1: The k-means clustering algorithm [4]

Input:
T = {T1, T2, ..., T10}

Output:
Set of 8 clusters.
K = 8

We use Spyder for the implementation; here we use Spyder version 3.7. Spyder is an integrated development environment for scientific programming in Python. We use packages such as matplotlib, numpy, sklearn, and pandas, which help to plot the elbow graph and build the data frame table for the K-means clustering algorithm.

The dataset is collected from Kaggle datasets and imported into Spyder in CSV format, as shown in Fig 1. We perform normalization before finding the appropriate number of clusters (k) using the elbow method. The elbow method performs k-means clustering on the dataset for a range of values of k (2-15) and calculates the SSE (sum of squared errors) for each value. A line chart of the SSE against k is then plotted.
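The elbow procedure described above can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the data are synthetic stand-ins for the Kaggle crime table, min-max scaling is assumed as the normalization step, and the variable names are hypothetical. The SSE is read from scikit-learn's `inertia_` attribute.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data (assumption: the real crime CSV is not reproduced here).
rng = np.random.default_rng(0)
centers = rng.uniform(0, 1000, size=(4, 3))
data = np.vstack([c + rng.normal(scale=20.0, size=(50, 3)) for c in centers])

# Assumed normalization: rescale every column to [0, 1] so that no single
# attribute dominates the Euclidean distances.
scaled = MinMaxScaler().fit_transform(data)

# Run k-means for k = 2..15 and record the SSE for each k.
k_values = list(range(2, 16))
sse = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(scaled)
    sse.append(model.inertia_)  # sum of squared distances to the nearest centroid

for k, value in zip(k_values, sse):
    print(k, round(value, 3))
```

Plotting `k_values` against `sse` (for example with `matplotlib.pyplot.plot(k_values, sse, marker="o")`) gives the line chart; the k at which the curve flattens is taken as the elbow.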
The dataset shown in Fig 1 consists of crimes in India. It contains broad evidence about various crimes that took place in India during 2001-2010. The dataset contains 1053 values, and normalization is performed to rescale it.

The resulting pre-processed dataset goes through the k-means clustering technique. The initial stage of an unsupervised algorithm should be finding the optimal number of clusters into which the data can be clustered. We plot the curve of total_within_ss against the number of clusters k. To find the best k value, we try values of k from 2 to 15. From this graph we observe that the change occurs at the 8th value, so we take k=8.

The clustering results are given below:

Total male criminals are highest in cluster 2, i.e. value 13746.
Female criminals are highest in cluster 6.
Males below 18 years of age are highest in cluster 4.
Females below 18 years of age are highest in cluster 4.

After clustering, Fig 4 shows the cluster labels added as the second attribute of the original dataset.
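As a rough sketch of the clustering step with k=8, the snippet below fits k-means, adds the cluster labels as the second column of the data frame, and sums each attribute per cluster. The column names and the random values are hypothetical placeholders, and min-max scaling is again an assumption; the numbers it prints are unrelated to the results reported in this paper.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import MinMaxScaler

# Hypothetical column names and random values; the paper's dataset is not reproduced here.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "male_total": rng.integers(0, 500, size=200),
    "female_total": rng.integers(0, 100, size=200),
    "male_below_18": rng.integers(0, 50, size=200),
    "female_below_18": rng.integers(0, 20, size=200),
})

# Assumed normalization step: rescale each attribute to [0, 1] before clustering.
features = MinMaxScaler().fit_transform(df)

# Cluster with the k chosen from the elbow graph.
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0)
labels = kmeans.fit_predict(features)

# Add the cluster labels as the second attribute of the original table.
df.insert(1, "cluster", labels)

# Sum each attribute per cluster, then find the cluster with the largest total.
totals = df.groupby("cluster").sum()
print(totals)
print(totals.idxmax())
```

`totals.idxmax()` returns, for each attribute, the label of the cluster with the largest sum, which is the kind of comparison summarized in the clustering results.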
CONCLUSION
REFERENCES