Elements of information theory
Fast subsequence matching in time-series databases
SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Similarity-based queries for time series data
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Efficiently supporting ad hoc queries in large datasets of time sequences
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Efficient search for approximate nearest neighbor in high dimensional spaces
STOC '98 Proceedings of the thirtieth annual ACM symposium on Theory of computing
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
A comparison of DFT and DWT based similarity search in time-series databases
Proceedings of the ninth international conference on Information and knowledge management
Database-friendly random projections
PODS '01 Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Space-efficient online computation of quantile summaries
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Locally adaptive dimensionality reduction for indexing large time series databases
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Dynamic multidimensional histograms
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Efficient Similarity Search In Sequence Databases
FODO '93 Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms
HierarchyScan: A Hierarchical Similarity Search Algorithm for Databases of Long Sequences
ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
Similarity Search in High Dimensions via Hashing
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Fast Time Sequence Indexing for Arbitrary Lp Norms
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Identifying Representative Trends in Massive Time Series Data Sets Using Sketches
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries
Proceedings of the 27th International Conference on Very Large Data Bases
Stable distributions, pseudorandom generators, embeddings and data stream computation
FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
Indexing multi-dimensional time-series with support for multiple distance measures
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Online Amnesic Approximation of Streaming Time Series
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
High Performance Discovery In Time Series: Techniques And Case Studies (Monographs in Computer Science)
StatStream: statistical monitoring of thousands of data streams in real time
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Exact indexing of dynamic time warping
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Warping the time on data streams
Data & Knowledge Engineering
Collaborative data gathering in wireless sensor networks using measurement co-occurrence
Computer Communications
Flexible least squares for temporal data mining and statistical arbitrage
Expert Systems with Applications: An International Journal
Managing massive time series streams with multi-scale compressed trickles
Proceedings of the VLDB Endowment
On privacy in time series data mining
PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
Fast approximate correlation for massive time-series data
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
MG-join: detecting phenomena and their correlation in high dimensional data streams
Distributed and Parallel Databases
A new class of attacks on time series data mining
Intelligent Data Analysis
Fast Discovery of Group Lag Correlations in Streams
ACM Transactions on Knowledge Discovery from Data (TKDD)
A review on time series data mining
Engineering Applications of Artificial Intelligence
Preserving Privacy in Time Series Data Mining
International Journal of Data Warehousing and Mining
Efficient sentiment correlation for large-scale demographics
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Discovering longest-lasting correlation in sequence databases
Proceedings of the VLDB Endowment
On clustering large number of data streams
Intelligent Data Analysis
Data arriving in time order (a data stream) arises in fields including physics, finance, medicine, and music, to name a few. Often the data comes from sensors (in physics and medicine, for example) whose data rates continue to increase dramatically as sensor technology improves. Further, the number of sensors is increasing, so correlating data between sensors becomes ever more critical in order to distill knowledge from the data. In many applications such as finance, recent correlations are of far more interest than long-term correlations, so correlation over sliding windows (windowed correlation) is the desired operation. Fast response is desirable in many applications (e.g., to aim a telescope at an activity of interest or to perform a stock trade). These three factors -- data size, windowed correlation, and fast response -- motivate this work.
Previous work [10, 14] showed how to compute Pearson correlation using Fast Fourier Transforms and Wavelet transforms, but such techniques don't work for time series in which the energy is spread over many frequency components, thus resembling white noise. For such "uncooperative" time series, this paper shows how to combine several simple techniques -- sketches (random projections), convolution, structured random vectors, grid structures, and combinatorial design -- to achieve high performance windowed Pearson correlation over a variety of data sets.
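The relationship between windowed Pearson correlation and sketches can be illustrated with a minimal sketch. It is not the paper's algorithm: it covers only the random-projection step and omits convolution, structured random vectors, grid structures, and combinatorial design. It relies on the standard identity that, for windows shifted to zero mean and scaled to unit norm, correlation equals 1 minus half the squared Euclidean distance, and on the fact that a Gaussian random projection preserves squared distance in expectation. All function names, the sketch size, and the seed are assumptions chosen for illustration.

```python
import math
import random


def normalize(window):
    """Shift a window to zero mean and scale it to unit L2 norm."""
    mean = sum(window) / len(window)
    centered = [v - mean for v in window]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        raise ValueError("constant window has undefined correlation")
    return [c / norm for c in centered]


def pearson(x, y):
    """Exact Pearson correlation of two equal-length windows."""
    return sum(a * b for a, b in zip(normalize(x), normalize(y)))


def windowed_correlation(xs, ys, window_len):
    """Exact correlation at every sliding-window position (naive, O(n * w))."""
    return [pearson(xs[i:i + window_len], ys[i:i + window_len])
            for i in range(len(xs) - window_len + 1)]


def make_projection(window_len, sketch_size, seed=0):
    """Hypothetical helper: rows of i.i.d. N(0, 1) entries."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(window_len)]
            for _ in range(sketch_size)]


def sketch(window, projection):
    """Project the normalized window onto each random row, scaled by 1/sqrt(k)."""
    unit = normalize(window)
    scale = 1.0 / math.sqrt(len(projection))
    return [scale * sum(r * u for r, u in zip(row, unit)) for row in projection]


def sketch_correlation(sx, sy):
    """Estimate correlation from two sketches: corr ~ 1 - ||sx - sy||^2 / 2."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(sx, sy))
    return 1.0 - dist_sq / 2.0


if __name__ == "__main__":
    x = [math.sin(0.3 * i) for i in range(64)]
    y = [v + 0.3 * math.cos(1.7 * i) for i, v in enumerate(x)]
    proj = make_projection(window_len=64, sketch_size=200)
    print("exact   :", pearson(x, y))
    print("sketched:", sketch_correlation(sketch(x, proj), sketch(y, proj)))
```

The estimate is unbiased for the squared distance and its spread shrinks as the sketch size grows, so in this toy setup the sketch size trades accuracy against the cost of comparing many pairs of streams.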