Mining sequential data is an old topic that has been revived in the last decade by the increasing availability of sequential datasets. Most work in this field centres on the definition and use of a distance (or, at least, a similarity measure) between sequences of elements. A measure called dynamic time warping (DTW) currently seems to be the most relevant for a wide range of applications. This article is about the use of DTW in data mining algorithms, and focuses on the computation of an average of a set of sequences. Averaging is an essential tool for the analysis of data. For example, the K-means clustering algorithm repeatedly computes such an average, and needs to provide a description of the clusters it forms. Averaging is therefore a crucial step, which must be sound for algorithms to work accurately. When dealing with sequences, especially when sequences are compared with DTW, averaging is not a trivial task. Starting with existing techniques developed around DTW, the article suggests an analysis framework to classify averaging techniques. It then proceeds to study the two major questions raised by the framework. First, we develop a global technique for averaging a set of sequences. This technique is original in that it avoids iterative pairwise averaging, and is thus insensitive to ordering effects. Second, we describe a new strategy to reduce the length of the resulting average sequence. This has a favourable impact on performance, and also on the relevance of the results. Both aspects are evaluated on standard datasets, and the evaluation shows that they compare favourably with existing methods. The article ends by describing the use of averaging in clustering. The last section also introduces a new application domain, namely the analysis of satellite image time series, where data mining techniques provide an original approach.
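To make the idea of averaging under DTW more concrete, the following is a minimal Python sketch, not the article's own implementation. It assumes one-dimensional numeric sequences, a squared-difference local cost, and the classic three-move DTW recurrence. The names `dtw_path` and `refine_average` are hypothetical. The update rule shown (align every sequence to the current average, then replace each element of the average by the mean of the elements aligned with it) is one possible reading of a "global" scheme that avoids pairwise averaging; the abstract itself does not specify the rule.

```python
from math import inf


def dtw_path(a, b):
    """Return (cost, path) of an optimal DTW alignment of sequences a and b.

    cost is the sum of squared differences along the path (an assumed local
    cost); path is a list of index pairs (i, j) from (0, 0) to
    (len(a) - 1, len(b) - 1).
    """
    n, m = len(a), len(b)
    # acc[i][j] = cost of the best alignment of a[:i] with b[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            acc[i][j] = d + min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])
    # Backtrack from the end to recover which elements were aligned.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(acc[i - 1][j - 1], i - 1, j - 1),
                 (acc[i - 1][j], i - 1, j),
                 (acc[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return acc[n][m], path


def refine_average(average, sequences):
    """One global update step (hypothetical helper, assumed update rule).

    Every sequence is aligned to the current average independently, so the
    result does not depend on the order in which sequences are listed.
    """
    buckets = [[] for _ in average]
    for seq in sequences:
        _, path = dtw_path(average, seq)
        for i, j in path:
            buckets[i].append(seq[j])
    # Each position of the average is aligned with at least one element of
    # every sequence, so no bucket is empty when sequences is non-empty.
    return [sum(b) / len(b) for b in buckets]
```

In this sketch the average keeps the length of its initial guess, so any length-reduction strategy would be a separate step. Calling `refine_average` repeatedly, starting for instance from one of the input sequences, gives an iterative refinement; because all sequences contribute to each update at once, reordering the input list leaves the alignments unchanged.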