Querying and mining of time series data: experimental comparison of representations and distance measures

Authors:
Hui Ding;Goce Trajcevski;Peter Scheuermann;Xiaoyue Wang;Eamonn Keogh
Affiliations:
Northwestern University, Evanston, IL;Northwestern University, Evanston, IL;Northwestern University, Evanston, IL;University of California, Riverside, CA;University of California, Riverside, CA
Venue:
Proceedings of the VLDB Endowment
Year:
2008

Abstract

The last decade has witnessed tremendous growth of interest in applications that deal with querying and mining of time series data. Numerous representation methods for dimensionality reduction and similarity measures geared towards time series have been introduced. Each individual work introducing a particular method has made specific claims and, aside from occasional theoretical justification, provided quantitative experimental observations. However, for the most part, the comparative aspects of these experiments were too narrowly focused on demonstrating the benefits of the proposed methods over some of the previously introduced ones. In order to provide a comprehensive validation, we conducted an extensive set of time series experiments, re-implementing 8 different representation methods and 9 similarity measures and their variants, and testing their effectiveness on 38 time series data sets from a wide variety of application domains. In this paper, we give an overview of these different techniques and present our comparative experimental findings regarding their effectiveness. Our experiments have provided a unified validation of some of the existing achievements and, in some cases, suggested that certain claims in the literature may be unduly optimistic.
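
To make the notion of a time series similarity measure concrete, here is a minimal sketch of two commonly used ones: lock-step Euclidean distance and unconstrained dynamic time warping (DTW). The abstract does not name the measures that were evaluated, so the choice of these two, the function names `euclidean` and `dtw`, and the toy series are all illustrative assumptions and not the authors' implementation.

```python
import math


def euclidean(a, b):
    """Lock-step L2 distance; both series must have the same length."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Unconstrained DTW with squared point-wise cost (illustrative sketch).

    Keeps only two rows of the dynamic-programming table at a time.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    prev = [0.0] + [inf] * m  # row 0: only the empty/empty alignment is free
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # extend the cheapest of: stay on b[j], stay on a[i], advance both
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])


# Hypothetical toy data: the same bump, shifted by one time step.
series_a = [0, 0, 1, 2, 1, 0, 0]
series_b = [0, 1, 2, 1, 0, 0, 0]

d_euclid = euclidean(series_a, series_b)
d_dtw = dtw(series_a, series_b)
print(d_euclid, d_dtw)
```

On this toy pair the lock-step measure penalizes the one-step shift, while the warping measure can align the two bumps, which hints at why such measures may rank neighbors differently in a comparison like the one described above.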