Approximate clustering of time series using compact model-based descriptions

Authors:
Hans-Peter Kriegel; Peer Kröger; Alexey Pryakhin; Matthias Renz; Andrew Zherdin
Affiliations:
Institute for Computer Science, Ludwig-Maximilians-University of Munich, Munich, Germany (all authors)
Venue:
DASFAA'08 Proceedings of the 13th international conference on Database systems for advanced applications
Year:
2008

Abstract

Clustering time series is usually limited by runtime, which grows significantly with the length of the time series. On the other hand, approximate clustering applied to existing compressed representations of time series (e.g. those obtained through dimensionality reduction) usually suffers from low accuracy. We propose a method for compressing time series based on mathematical models that explore dependencies between different time series. In particular, each time series is represented by a combination of a set of specific reference time series. The cost of this representation depends only on the number of reference time series rather than on the length of the time series. We show that using only a small number of reference time series yields a rather accurate representation while significantly reducing the storage cost and the runtime of clustering algorithms. Our experiments illustrate that these representations can be used to produce an approximate clustering with high accuracy and considerably reduced runtime.
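
The idea of describing each time series by a combination of reference time series can be illustrated with a minimal sketch. The abstract does not specify the model, so the sketch assumes a plain linear combination fitted by ordinary least squares; the function names, the three synthetic reference series, and the noise level are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np


def fit_coefficients(series, references):
    """Describe each series by least-squares weights over the reference series.

    series:     array of shape (n_series, length)
    references: array of shape (n_refs, length)
    returns:    array of shape (n_series, n_refs)

    Assumption: the "combination" is linear; the paper may use another model.
    """
    # Solve references.T @ w ~= series.T for all series in one call.
    weights, *_ = np.linalg.lstsq(references.T, series.T, rcond=None)
    return weights.T


def reconstruct(weights, references):
    """Rebuild approximate series from their compact descriptions."""
    return weights @ references


# Synthetic data (hypothetical): 20 series of length 500 built from 3 references.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 500)
references = np.vstack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t), t])
true_weights = rng.normal(size=(20, 3))
series = true_weights @ references + 0.01 * rng.normal(size=(20, 500))

# Each series is now stored as 3 numbers instead of 500 values.
weights = fit_coefficients(series, references)
approx = reconstruct(weights, references)

# Relative reconstruction error over the whole data set.
rel_error = np.linalg.norm(series - approx) / np.linalg.norm(series)
print(f"descriptor size: {weights.shape[1]}, relative error: {rel_error:.4f}")
```

In this sketch the descriptor length equals the number of reference series (3) and is independent of the series length (500), which mirrors the cost argument in the abstract. The rows of `weights` could then be handed to any standard clustering algorithm in place of the raw series.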