
International Institution for Technological Research and Development

Volume 1, Issue 1, 2015

Review of Various Clustering Methods Used to Categorize Seismic Data into Earthquake and Mining Blast
Kashyap Agravat, Madhushree B., Jasmine Jha
L.J. Institute of Engg. & Tech., Ahmedabad, Gujarat, India
kashyapagravat111@gmail.com, bmadhushree@yahoo.com, jhajasmine@gmail.com

Abstract: An earthquake is a natural event that releases stored energy, producing mainly P- and S-waves at its point of occurrence. The behavior of these waves is important for analyzing the event. Mining sites that extract gold, rock or sand often use quarry blasts at the surface to accelerate mining instead of drilling. A mining blast generates the same kinds of waves as an earthquake, and a seismogram alone cannot differentiate between the two: the amplitudes and signals of the two events can be so similar that they cannot be separated from each other. Such mining sites are sometimes located near residential areas, so discriminating earthquake data from quarry blast data is important both for in-depth analysis of geological activity and for public awareness. Data clustering is an effective technique that divides data into a number of clusters: items with similar characteristics are grouped in one cluster, and the characteristics of the data in one cluster differ from those of the data in other clusters. The Institute of Seismological Research (ISR) is a government organization that captures real-time seismic data using highly sensitive sensors. These sensors also record mining blasts and vehicle vibrations, which produce waveforms similar to earthquake waveforms. Based on these data, ISR scientists construct a blueprint of the region, which is later helpful for constructing buildings. The scientists therefore face the same problem: separating earthquake data that are mixed with quarry blast and vehicle vibration data.
In this paper, we study several clustering and other data discrimination methods, analyzing their complexity parameters and limitations; some of them can be applied to the discrimination of earthquakes and quarry blasts.

Index Terms—Seismic data clustering, comparison of clustering methods, earthquake data clustering, similarity measures, seismic data
differentiation

I. INTRODUCTION

A challenge in seismic monitoring is to reliably discriminate between natural seismicity and anthropogenic events such as mining blasts. Seismic events such as earthquakes and blasts generate two basic types of elastic waves: P-waves and S-waves. These waves cause the shaking that is felt and cause damage in various ways. It is useful to focus on P-wave spectra because they have a good signal-to-noise (STN) ratio over a much wider bandwidth than the S-wave spectra. The magnitude of a small earthquake and that of a quarry blast may be the same. Heavily loaded vehicles can also sometimes produce high-magnitude readings that cannot be discriminated. The focus of this paper is to study clustering methods that can be used to differentiate such mixed seismic data.

Clustering is defined as the unsupervised classification of data items or observations, i.e. the data have not been classified into any group and so have no class attribute associated with them. Clustering is widely used as one of the important steps in exploratory data analysis. Clustering algorithms are used to find useful and previously unidentified classes of patterns. Clustering divides the data into groups of similar objects, and dissimilar objects are placed in separate clusters. Depending on the metric chosen, a data object may belong to a single cluster or to more than one cluster [2].

We studied several papers and provide a comparative analysis of clustering algorithms that may help to differentiate earthquake and explosion blast data that share the same parameters and region of occurrence.


Fig. 1. Vertical-component seismograms of two earthquakes (a, b) and two quarry blasts (c, d) recorded by the local seismic network of Agadir [8]

II. LITERATURE REVIEWS

In [2], Garima et al. discussed clustering techniques and divided them into major categories: Partitioning-Based, Hierarchical-Based, Density-Based, Grid-Based and Model-Based, which are further subdivided as shown in Fig. 2.1.

FIG 2.1: Classifications of Clustering Algorithms [2]

In the partitioning clustering method, the data set is divided into a number of partitions, which are later referred to as clusters. Each cluster or partition must contain at least one object, and the partitions do not overlap. In the hierarchical method, a dendrogram (a tree of clusters) is constructed; this tree is based on a measure of proximity. Grid-based techniques are used in spatial applications, where a large space is divided into a number of cells.

We have a large and continuous data set of earthquakes, quarry blasts and vehicle vibrations, so the Density-Based, Grid-Based, Model-Based and traditional partitioning methods are not directly applicable; the traditional methods require some improvements.

T. Hitendra Sarma, P. Viswanath and B. Eswara Reddy introduced a fast approximate kernel k-means clustering method for large data sets in [3]. They presented the algorithm in three steps. First, a kernel-based leaders clustering method finds a set of prototypes. Second, the prototypes are given as input to the k-means clustering method, which produces a partition of the prototypes. Finally, the partition of the entire data set is obtained by replacing each prototype with its followers.
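The three steps reported for [3] can be illustrated with a small sketch. This is not the authors' implementation: the RBF kernel, the parameter names (`threshold`, `gamma`), the unweighted kernel k-means on the prototypes and the random initialisation are all assumptions made here for illustration, using only the Python standard library.

```python
import math
import random


def rbf(a, b, gamma=1.0):
    """RBF kernel value between two points (assumed kernel choice)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def kernel_leaders(data, threshold, gamma=1.0):
    """Step 1: single pass; a point joins the first leader lying within
    `threshold` in kernel-induced distance, else it becomes a new leader."""
    leaders, followers = [], []
    for i, point in enumerate(data):
        for pos, lead in enumerate(leaders):
            # For an RBF kernel k(a, a) = 1, so d(a, b)^2 = 2 - 2 k(a, b).
            k = rbf(point, data[lead], gamma)
            if math.sqrt(max(2.0 - 2.0 * k, 0.0)) <= threshold:
                followers[pos].append(i)
                break
        else:
            leaders.append(i)
            followers.append([i])
    return leaders, followers


def kernel_kmeans(kmat, n_clusters, n_iter=50, seed=0):
    """Step 2: unweighted kernel k-means on a precomputed kernel matrix."""
    n = len(kmat)
    order = list(range(n))
    random.Random(seed).shuffle(order)
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank % n_clusters  # every cluster starts non-empty
    for _ in range(n_iter):
        groups = [[j for j in range(n) if labels[j] == c]
                  for c in range(n_clusters)]
        # Mean kernel value inside each cluster (None for an empty cluster).
        within = [sum(kmat[j][l] for j in g for l in g) / (len(g) ** 2)
                  if g else None for g in groups]
        new_labels = []
        for i in range(n):
            best, best_d = labels[i], math.inf
            for c, g in enumerate(groups):
                if not g:
                    continue
                cross = sum(kmat[i][j] for j in g) / len(g)
                d = kmat[i][i] - 2.0 * cross + within[c]
                if d < best_d:
                    best, best_d = c, d
            new_labels.append(best)
        if new_labels == labels:
            break
        labels = new_labels
    return labels


def approx_kernel_kmeans(data, n_clusters, threshold, gamma=1.0):
    """Steps 1 to 3: leaders, k-means on prototypes, labels for followers."""
    leaders, followers = kernel_leaders(data, threshold, gamma)
    kmat = [[rbf(data[a], data[b], gamma) for b in leaders] for a in leaders]
    proto_labels = kernel_kmeans(kmat, n_clusters)
    labels = [0] * len(data)
    for pos, members in enumerate(followers):
        for i in members:
            labels[i] = proto_labels[pos]  # Step 3: follower inherits label
    return labels
```

The sketch makes the role of the threshold visible: a larger `threshold` yields fewer prototypes and a faster second step, at the cost of a coarser final partition.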

An improved k-medoids method for clustering large data sets is more effective than the traditional k-medoids method, according to Danyang Cao and Bingru Yang [4]. They modified the k-medoids clustering algorithm to build an improved algorithm based on the clustering features (CF) of the BIRCH algorithm. They preserve all the training data in a CF-Tree and then apply k-medoids to cluster the CFs in the leaf nodes of the CF-Tree. Eventually they obtain k clusters from the root of the CF-Tree. The time complexity, the scalability on large data sets and the handling of convex cluster shapes are better than with the k-medoids algorithm.

HUANG Hanming, LI Rui and LU Shi Jun of Guangxi Normal University, Guilin, China, introduced a method for discriminating earthquakes from explosions using chirp-Z transform spectrum features [5]. In their study, the overall spectrum layout is first acquired with the fast Fourier transform (FFT). Based on this overall layout of the seismic signals, the frequency range of the spectrum that contains the most discriminative information is selected. The chirp-Z transform (CZT) is then applied to obtain a finer-resolution spectrum of that range, which yields more accurate spectrum features. They mainly describe an algorithm that employs the CZT to derive the dominant frequency and the associated average energy. They selected 40 earthquake events and 40 explosion events that occurred in neighboring regions of north China.

In [6], Qing Cao et al. introduced a scalable clustering algorithm to discover frequently repeated trips from large-scale, event-based telematics data sets collected via a satellite-based tracking system. They observed that moving objects are challenging to analyze because of the enormous amount of data, the data quality and the approximate nature of the spatial data type. They first indexed the trips with a grid indexing method and then compared only trips sharing the same grid neighborhood instead of making an exhaustive pair-wise comparison of all trips. The advantage of grid indexing is that it significantly decreases the size of the data space needed to run the distance computation during the hierarchical clustering process.

In 2006, YANG Peijie et al. [7] used a fuzzy clustering approach for seismic data analysis. The fuzzy clustering method is useful for locating clusters embedded in background noise. Professor Jim Bezdek originally introduced this technique in 1981. As presented in [7], the algorithm attempts to partition a finite collection of elements into a collection of c fuzzy clusters with respect to some given criterion.

III. COMPARISON OF CLUSTERING METHODS

In this section we analyze various clustering algorithms. We feel that some clustering techniques can be useful for differentiating seismological data such as earthquakes and mining blasts; the analysis is presented in Table 1.
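The spectral refinement reviewed for [5] in Section II (select a band from the coarse spectrum, then resolve it more finely) can be sketched as follows. This is only an illustration under stated assumptions: the spectrum is evaluated by direct summation on a fine frequency grid, which gives the same values a chirp-Z transform would compute on that arc of the unit circle but without its fast algorithm, and "average energy" is taken here to mean the mean power over the band. The function names and parameters (`fs`, `f_lo`, `f_hi`, `n_bins`) are hypothetical.

```python
import cmath
import math


def zoom_spectrum(signal, fs, f_lo, f_hi, n_bins):
    """Evaluate the z-transform of `signal` at n_bins equally spaced
    frequencies in [f_lo, f_hi) on the unit circle. A chirp-Z transform
    computes these values quickly; here they are summed directly."""
    step = (f_hi - f_lo) / n_bins
    freqs = [f_lo + k * step for k in range(n_bins)]
    spectrum = []
    for f in freqs:
        w = -2j * math.pi * f / fs
        spectrum.append(sum(x * cmath.exp(w * n) for n, x in enumerate(signal)))
    return freqs, spectrum


def band_features(signal, fs, f_lo, f_hi, n_bins=200):
    """Return (dominant frequency, mean power) inside the selected band."""
    freqs, spectrum = zoom_spectrum(signal, fs, f_lo, f_hi, n_bins)
    power = [abs(v) ** 2 for v in spectrum]
    peak = max(range(n_bins), key=power.__getitem__)
    return freqs[peak], sum(power) / n_bins
```

For a 4 s record sampled at 100 Hz, the plain FFT bins are 0.25 Hz apart, whereas `band_features(signal, 100.0, 2.0, 5.0, n_bins=300)` samples the 2 to 5 Hz band every 0.01 Hz. The finer grid locates the spectral peak more precisely; it does not add information beyond what the record length allows.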

TABLE 1. ANALYSIS OF SEVERAL CLUSTERING TECHNIQUES AND THEIR APPLICATIONS

| Authors | Technique / Clustering Method | Dataset | Advantages | Applicable or not | Reason |
|---|---|---|---|---|---|
| T. Hitendra Sarma, P. Viswanath, B. Eswara Reddy | Kernel k-means clustering method | Homogeneous dataset | Simple to implement; more reliable results than the traditional k-means algorithm | No | There may be some loss in clustering quality. For our data set it is quite impossible to define the threshold (t) as they did. |
| Dalal, Harale | k-means clustering | Numerical data (crisp data set) | Large data sets are processed easily; simple to implement; results are easy to interpret | No | Sensitive to noise; depends on the initial value of k; poor locally optimal solutions |
| Danyang Cao, Bingru Yang | Improved k-medoids algorithm | Homogeneous dataset | Improves on drawbacks of the k-medoids algorithm, such as time complexity, scalability on large data sets, and the inability to find clusters of very different sizes and convex shapes | No | Deriving the CF-Tree from the data set requires much effort, and data may be mixed during the implementation of this method |
| HUANG Hanming, LI Rui, LU Shi Jun | Chirp-Z transform spectrum features | Homogeneous dataset | Improves the accuracy of the discrimination problem | Yes | Useful for a limited number of readings; as more events and readings are added, accuracy and time performance degrade |
| Qing Cao, Bouchra Bouqata, Patricia Mackenzie, Daniel Messier, Joseph J. Salvo | Grid-based clustering method | Real-time homogeneous dataset | The proposed algorithm significantly reduces the computational time needed for clustering in the hierarchical clustering algorithm | No | Earthquake and quarry blast data are recognized as the same pattern; more applicable to spatial data |
| YANG Peijie, YIN Xingyao, ZHANG Guangzhi | Fuzzy clustering | Training dataset | Overlapping clusters are formed through the use of a membership function; the number of clusters need not be defined in advance | Yes | If its limitations (limited number of parameters, cross-plotting problem) are reduced and the method is improved, it can be used to differentiate earthquake data from mining blasts |
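To make the membership-function idea behind the fuzzy clustering row of Table 1 concrete, a minimal fuzzy c-means sketch is given here. It is an illustration only, not the procedure of [7]: the Euclidean distance, the fuzzifier `m = 2`, the random initial memberships and the stopping tolerance are assumptions, and this particular sketch takes the number of clusters c (`n_clusters`) as an input.

```python
import random


def fuzzy_c_means(data, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Return (centers, memberships); memberships[i][c] lies in [0, 1]
    and each row sums to 1, so a point can belong partly to several clusters."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Centre update: mean of the data weighted by u_ic^m.
        centers = []
        for c in range(n_clusters):
            weights = [u[i][c] ** m for i in range(n)]
            wsum = sum(weights)
            centers.append([sum(w * x[d] for w, x in zip(weights, data)) / wsum
                            for d in range(dim)])
        # Membership update: u_ic = 1 / sum_k (d_ic / d_ik)^(2 / (m - 1)).
        shift = 0.0
        for i, x in enumerate(data):
            d2 = [max(sum((a - b) ** 2 for a, b in zip(x, ctr)), 1e-12)
                  for ctr in centers]
            inv = [v ** (-1.0 / (m - 1.0)) for v in d2]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:
            break
    return centers, u
```

In a setting like the one discussed in this paper, each row of `data` could hold per-event features (for example a dominant frequency and a band energy), and an event whose memberships are close to equal would be flagged as ambiguous rather than forced into one class.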
IV. CONCLUSION

Cluster analysis, as one of the key technologies of data mining, has broad development prospects. This paper introduces several clustering methods that can be applied to differentiate the large seismic data sets of ISR (Institute of Seismological Research), which contain earthquake, quarry blast and vehicle vibration data of the same frequency and magnitude.

By studying various clustering methods, we found that traditional clustering algorithms such as simple partitioning, density-based, grid-based and hierarchical methods are not sufficient to differentiate earthquakes from vibration and explosion blast data. A fuzzy clustering algorithm may be useful if its limitation of a limited number of parameters is reduced.

ACKNOWLEDGMENT

This work is supported by the Institute of Seismological Research (ISR), Gandhinagar, India.

REFERENCES

[1] Ahalya G., Hari Mohan Pandey, “Data Clustering Approaches Survey and Analysis,” CSE Department, ASET, Amity University, Noida, India, 2015 1st International Conference on Futuristic Trend in Computational Analysis and Knowledge Management (ABLAZE-2015).
[2] Garima, Hina Gulati, P.K. Singh, “Clustering Techniques in Data Mining: A Comparison,” Amity University, Uttar Pradesh, Noida, India, 978-9-3805-4416-8/15, 2015 IEEE.
[3] T. Hitendra Sarma, P. Viswanath, B. Eswara Reddy, “A Fast Approximate Kernel k-means Clustering Method For Large Data sets,” Department of Computer Science and Engineering, Rajeev Gandhi Memorial College of Eng. and Technology, Nandyal-518501, Department of Computer Science and Engineering, JNTUA College of Engineering, Anantapur-515002, A.P., India, 978-1-4244-9477-4/11, 2011 IEEE.
[4] Danyang Cao, Bingru Yang, “An improved k-medoids clustering algorithm,” College of Information Engineering, North China University of Technology, NCUT, Beijing 100144, College of Information Engineering, Beijing University of Science and Technology, BUST, Beijing 100083, China, 978-1-4244-5586-7/10, 2010 IEEE.
[5] HUANG Hanming, LI Rui, LU Shi Jun, BIAN Yin Ju, “Discrimination of Earthquakes and Explosions Using Chirp-Z Transform Spectrum Features,” College of Computer Science and Information Engineering, Guangxi Normal University, Guilin, 541004, Institute of Geophysics, China Seismological Bureau, Beijing, 100081, China, 2009 World Congress on Computer Science and Information Engineering, DOI 10.1109/CSIE.2009.696.
[6] Qing Cao, Bouchra Bouqata, Patricia D. Mackenzie, Daniel Messier, Joseph J. Salvo, “A Grid-Based Clustering Method For Mining Frequent Trips From Large-Scale, Event-Based Telematics Datasets,” Computing and Decision Science, GE Global Research Center, One Research Circle, Niskayuna, NY 12309, Proceedings of the 2009 IEEE International Conference on Systems, Man, and Cybernetics, San Antonio, TX, USA, October 2009.
[7] YANG Peijie, YIN Xingyao, ZHANG Guangzhi, “Seismic Data Analysis Based on Fuzzy Clustering,” China University of Petroleum, Shandong, China 257067, ICSP2006 Proceedings.
[8] Han, Jiawei, and Micheline Kamber. Data Mining. Amsterdam: Elsevier, 2006. Print.
