Pattern discovery in data streams under the time warping distance

Authors:
Machiko Toyoda;Yasushi Sakurai;Yoshiharu Ishikawa
Affiliations:
NTT Communication Science Laboratories, Kyoto, Japan;NTT Communication Science Laboratories, Kyoto, Japan;Information Technology Center, Nagoya University, Aichi, Japan
Venue:
The VLDB Journal — The International Journal on Very Large Data Bases
Year:
2013

Abstract

Subsequence matching is a basic problem in data stream mining. In recent years, significant research effort has gone into efficiently finding subsequences similar to a query sequence. A related and challenging problem is how to identify common local patterns when both sequences are evolving. This problem arises in trend detection, clustering, and outlier detection. Dynamic time warping (DTW) is a powerful similarity measure that is often used for subsequence matching. However, applying DTW directly to this problem incurs a high computation cost. In this paper, we propose a one-pass algorithm, CrossMatch, that identifies common local patterns between two evolving sequences. CrossMatch addresses two important challenges: (1) How can we identify common local patterns efficiently without any omission? (2) How can we find common local patterns in data stream processing? To tackle these challenges, CrossMatch incorporates three ideas: (1) a scoring function, which computes the DTW distance indirectly to reduce the computation cost; (2) a position matrix, which stores starting positions to keep track of common local patterns in a streaming fashion; and (3) a streaming algorithm, which identifies common local patterns efficiently and outputs them on the fly. We provide a theoretical analysis and prove that our algorithm does not sacrifice accuracy. Our experimental evaluation and case studies show that CrossMatch can incrementally discover common local patterns in data streams using constant time per update and constant space.
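
For readers unfamiliar with the ingredients named in the abstract, the following Python sketch may help. The first function is the textbook DTW distance between two whole sequences. Its quadratic table is why running it for every candidate subsequence pair is expensive. The second function is only an analogy and is not the paper's scoring function or position matrix: it is a Smith-Waterman-style local score, with a second table that remembers where each scored path started. It is also a batch computation, not a streaming one. The names `dtw_distance`, `best_local_match`, and the threshold `eps` are assumptions introduced here for illustration.

```python
from math import inf


def dtw_distance(x, y):
    """Textbook DTW distance between two numeric sequences.

    Uses absolute difference as the local cost and fills an
    (len(x)+1) x (len(y)+1) table, so time and space are O(len(x) * len(y)).
    """
    n, m = len(x), len(y)
    # d[i][j] holds the DTW distance between x[:i] and y[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def best_local_match(x, y, eps):
    """Illustrative local scoring with start-position tracking.

    ASSUMPTION: this is a Smith-Waterman-style analogue, not the scoring
    function defined in the paper. Each aligned pair earns eps minus its
    absolute difference; a path whose running score would drop to zero or
    below is abandoned. Returns (score, start, end), where start and end
    are 0-based (index into x, index into y) pairs.
    """
    n, m = len(x), len(y)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    # start[i][j] remembers where the path ending at cell (i, j) began.
    start = [[None] * (m + 1) for _ in range(n + 1)]
    best = (0.0, None, None)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            gain = eps - abs(x[i - 1] - y[j - 1])
            neighbours = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
            pi, pj = max(neighbours, key=lambda c: score[c[0]][c[1]])
            total = score[pi][pj] + gain
            if total > 0:
                score[i][j] = total
                if score[pi][pj] > 0:
                    start[i][j] = start[pi][pj]  # extend an existing path
                else:
                    start[i][j] = (i - 1, j - 1)  # begin a new path here
            if score[i][j] > best[0]:
                best = (score[i][j], start[i][j], (i - 1, j - 1))
    return best


if __name__ == "__main__":
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
    print(best_local_match([9, 9, 1, 2, 3, 9], [5, 1, 2, 3, 5, 5], eps=0.5))
```

In the second call, the shared run 1, 2, 3 occupies positions 2 to 4 of the first sequence and positions 1 to 3 of the second, and the start table recovers those positions without a separate backtracking pass. Avoiding such a pass is the kind of bookkeeping a one-pass algorithm needs.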