Computer Science Department, Sri Ramakrishna College of Arts and Science for Women, Coimbatore, Tamil Nadu, India.
Sri Ramakrishna College of Arts & Science for Women, Coimbatore, Tamil Nadu, India.
Abstract: Mining time series data has attracted tremendous and growing interest. This paper studies and summarizes various implementations to identify the problems in existing applications. Clustering time series is a problem that has applications in a wide assortment of fields and has recently attracted a large amount of research. Time series data are frequently large and may contain outliers. In addition, time series are a special type of data set whose elements have a temporal ordering. Clustering such data streams is therefore an important issue in the data mining process. Numerous techniques and clustering algorithms have been proposed to assist the clustering of time series data streams. The clustering algorithms and their effectiveness in various applications are compared, with the aim of developing a new method to solve the existing problems. This paper presents a survey of the clustering algorithms available for time series datasets. Moreover, the distinctive features and limitations of previous research are discussed, and several feasible topics for future study are identified. Furthermore, the areas that utilize time series clustering are also summarized.
Keywords: Data Mining, Data Streams, Clustering, Time Series, Machine Learning, Unsupervised Learning, Feature Extraction and Feature Selection.

I. INTRODUCTION
Time series data management has become an interesting research topic for data miners. In particular, the clustering of time series has attracted the interest of researchers. Data mining is usually constrained by three limited resources: time, memory and sample size. Recently, time and memory have become the bottleneck for machine learning applications. Clustering is an unsupervised learning process that groups a dataset into subgroups. A data stream is an ordered sequence of points $x_1, \ldots, x_n$. These data can be read or accessed only once or a small number of times. A time series is a sequence of real numbers, each number indicating a value at a time point. In recent real-world applications, data flow continuously from a data stream at high speed, producing more examples over time. Traditional algorithms cannot keep up with the high-speed arrival of time series data. For this reason, new algorithms have been developed for processing data in real time.

Time series data are being generated at an unprecedented speed from almost every application domain, e.g., daily fluctuations of the stock market, fault diagnosis, dynamic scientific experiments, electrical power demand, position updates of moving objects in location-based services, various readings from sensor networks, biological and medical experimental observations, etc. Traditionally, clustering is treated as a batch procedure. Most clustering techniques fall into two major categories: partitional clustering and hierarchical clustering. They are the two key aspects for achieving effectiveness and efficiency when using time series data. A time series experiment requires multiple arrays, which makes it very expensive. Dimensionality reduction techniques can be divided into two groups: (i) feature extraction and (ii) feature selection. Feature extraction techniques extract a set of new features from the original attributes. Feature selection is a process that selects a subset of the original attributes. There have been numerous textbooks and publications on clustering of scientific data for a variety of areas such as taxonomy, agriculture, remote sensing, as well as process control. This paper presents a survey of the clustering algorithms available for time series datasets. Moreover, the distinctive features and limitations of previous research are discussed, and several feasible topics for future study are identified.
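To make the distinction between the two dimensionality reduction groups concrete, the following is a minimal Python sketch and not a method taken from the surveyed work. It assumes piecewise aggregate approximation (segment means) as one possible feature extraction step and a simple variance ranking of time points as one possible feature selection step. The function names `paa` and `select_by_variance`, and the toy dataset in the usage note, are hypothetical and chosen only for illustration.

```python
from statistics import mean, pvariance


def paa(series, n_segments):
    """Feature extraction (assumed technique: piecewise aggregate approximation).

    Builds NEW features: the mean of each of n_segments contiguous,
    near-equal-length segments of the series.
    """
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    # Segment boundaries; integer arithmetic keeps every segment non-empty.
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    return [mean(series[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]


def select_by_variance(dataset, k):
    """Feature selection (assumed criterion: variance across series).

    Keeps a SUBSET of the original attributes: the k time points whose
    values vary most across the equal-length series in the dataset.
    Returns the kept indices (in time order) and the reduced dataset.
    """
    n_points = len(dataset[0])
    variances = [pvariance([s[t] for s in dataset]) for t in range(n_points)]
    ranked = sorted(range(n_points), key=lambda t: variances[t], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[s[t] for t in keep] for s in dataset]
```

For example, with two hypothetical series of length 8, `paa(series, 4)` returns four segment means that did not exist as attributes in the original series, whereas `select_by_variance(dataset, 2)` returns two of the original time points unchanged. Either reduced representation could then be passed to a partitional or hierarchical clustering algorithm.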
Furthermore, the areas to which time series clustering has been applied are also summarized.

The remainder of the paper is organized as follows. Section 2 reviews the concept of time series and gives an overview of the algorithms of different techniques. Section 3 briefly discusses possible future extensions of the work. Section 4 concludes the paper with a short discussion.

II. RELATED WORK
A number of clustering techniques have been proposed for time series data streams. This section of
Clustering Time Series Data Stream – A Literature Survey
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 1, April 2010, p. 289. http://sites.google.com/site/ijcsis/ ISSN 1947-5500