A global averaging method for dynamic time warping, with applications to clustering

Authors:
François Petitjean;Alain Ketterlin;Pierre Gançarski
Affiliations:
University of Strasbourg, 7 rue René Descartes, 67084 Strasbourg Cedex, France and LSIIT-UMR 7005, Pôle API, Bd Sébastien Brant, BP 10413, 67412 Illkirch Cedex, France and Centre Na ...;University of Strasbourg, 7 rue René Descartes, 67084 Strasbourg Cedex, France and LSIIT-UMR 7005, Pôle API, Bd Sébastien Brant, BP 10413, 67412 Illkirch Cedex, France;University of Strasbourg, 7 rue René Descartes, 67084 Strasbourg Cedex, France and LSIIT-UMR 7005, Pôle API, Bd Sébastien Brant, BP 10413, 67412 Illkirch Cedex, France
Venue:
Pattern Recognition
Year:
2011




Abstract

Mining sequential data is an old topic that has been revived in the last decade by the increasing availability of sequential datasets. Most work in this field centres on the definition and use of a distance (or, at least, a similarity measure) between sequences of elements. A measure called dynamic time warping (DTW) currently seems to be the most relevant for a wide range of applications. This article is about the use of DTW in data mining algorithms, and focuses on the computation of an average of a set of sequences. Averaging is an essential tool for the analysis of data. For example, the K-means clustering algorithm repeatedly computes such an average, and needs to provide a description of the clusters it forms. Averaging is therefore a crucial step, which must be sound for algorithms to work accurately. When dealing with sequences, especially when sequences are compared with DTW, averaging is not a trivial task. Starting with existing techniques developed around DTW, the article proposes an analysis framework to classify averaging techniques. It then studies the two major questions raised by the framework. First, we develop a global technique for averaging a set of sequences. This technique is original in that it avoids iterative pairwise averaging, and is thus insensitive to ordering effects. Second, we describe a new strategy to reduce the length of the resulting average sequence. This has a favourable impact on performance, and also on the relevance of the results. Both aspects are evaluated on standard datasets, and the evaluation shows that they compare favourably with existing methods. The article ends by describing the use of averaging in clustering. The last section also introduces a new application domain, namely the analysis of satellite image time series, where data mining techniques provide an original approach.
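
To make the idea of a global average concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes one-dimensional sequences, a squared-difference local cost, and a caller-supplied initial average; the names `dtw_path`, `global_average_step`, `global_average` and `total_cost` are hypothetical. Each refinement pass aligns every sequence to the current average at once and then recomputes each average point as the mean of all the points aligned to it, so no pairwise averaging order is involved. The length-reduction strategy mentioned in the abstract is not covered.

```python
from math import inf


def dtw_path(a, b):
    """Return (cost, path) for DTW between 1-D sequences a and b.

    cost is the sum of squared differences along the best warping path
    (an assumed local cost); path is a list of (i, j) index pairs running
    from (0, 0) to (len(a) - 1, len(b) - 1).
    """
    n, m = len(a), len(b)
    # acc[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            acc[i][j] = d + min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])
    # Backtrack from the last cell to recover the alignment.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(acc[i - 1][j - 1], i - 1, j - 1),
                 (acc[i - 1][j], i - 1, j),
                 (acc[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return acc[n][m], path


def global_average_step(average, sequences):
    """One refinement pass over the whole set of sequences.

    Every sequence is aligned to the current average; each average point
    is then replaced by the mean of all points that were aligned to it.
    """
    buckets = [[] for _ in average]
    for seq in sequences:
        _, path = dtw_path(average, seq)
        for i, j in path:
            buckets[i].append(seq[j])
    return [sum(bucket) / len(bucket) for bucket in buckets]


def global_average(sequences, initial, iterations=10):
    """Refine `initial` for a fixed number of passes (an assumed stop rule)."""
    average = list(initial)
    for _ in range(iterations):
        average = global_average_step(average, sequences)
    return average


def total_cost(average, sequences):
    """Sum of DTW costs from the average to every sequence."""
    return sum(dtw_path(average, seq)[0] for seq in sequences)


if __name__ == "__main__":
    data = [[0.0, 1.0, 2.0, 1.0], [0.0, 0.0, 1.0, 2.0, 1.0], [0.0, 1.0, 2.0, 2.0, 1.0]]
    start = data[0]  # assumed initialisation: one member of the set
    print(global_average(data, start))
```

Because each pass gathers contributions from all sequences before updating, shuffling the input set changes only the order in which values are summed, not the alignments themselves. In a K-means setting, a function like `global_average` would stand in for the centroid update, with `dtw_path` supplying the distance used in the assignment step.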