Clustering time series is a problem with applications in a wide variety of fields, and it has recently attracted a large amount of research. In this paper we focus on clustering data derived from Autoregressive Moving Average (ARMA) models using the k-means and k-medoids algorithms with the Euclidean distance between estimated model parameters. We justify our choice of clustering technique and distance metric by reproducing results obtained in related research. Our research aim is to assess the effects on the clustering of time series of discretising data into binary sequences that record whether each value lies above or below the median, a process known as clipping. It is known that the fitted AR parameters of clipped data tend asymptotically to the parameters for unclipped data. We exploit this result to demonstrate that, for long series, the clustering accuracy when using clipped data from the class of ARMA models is not significantly different from that achieved with unclipped data. Next we show that if the data contain outliers then using clipped data produces significantly better clusterings. We then demonstrate that using clipped series requires much less memory and that operations such as distance calculations can be much faster. Finally, we demonstrate these advantages on three real-world data sets.
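
As a rough illustration of the clipping step described above, the following Python sketch (using NumPy) converts a series into a binary sequence around its median and packs the result eight values to a byte. The function names `clip`, `pack`, and `changed_bits`, the simulated Gaussian data, and the choice of bit packing are assumptions made for this example; they are not taken from the paper, and the bit-count comparison is shown only to illustrate how little a single outlier changes the clipped series, not as the distance measure used for clustering.

```python
import numpy as np


def clip(series):
    """Return a 0/1 array: 1 where a value is above the series median, else 0."""
    x = np.asarray(series, dtype=float)
    return (x > np.median(x)).astype(np.uint8)


def pack(clipped):
    """Pack a 0/1 array eight values to a byte (illustrative storage choice)."""
    return np.packbits(clipped)


def changed_bits(packed_a, packed_b):
    """Count the positions at which two packed series of equal length differ."""
    return int(np.unpackbits(np.bitwise_xor(packed_a, packed_b)).sum())


# Simulated data, assumed purely for the example.
rng = np.random.default_rng(0)
raw = rng.normal(size=1024)

# Same series with one gross outlier injected.
raw_with_outlier = raw.copy()
raw_with_outlier[10] = 1e6

packed = pack(clip(raw))
packed_outlier = pack(clip(raw_with_outlier))

print("bytes, unclipped float64:", raw.nbytes)
print("bytes, clipped and packed:", packed.nbytes)
print("bits changed by the outlier:", changed_bits(packed, packed_outlier))
```

Because the median moves by at most one rank when a single value is replaced, the outlier can alter only a couple of positions in the clipped series, whereas it would dominate any statistic computed directly from the raw values.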