In a variety of modern mining applications, data are commonly viewed as infinite, time-ordered data streams rather than as finite data sets stored on disk. This view challenges fundamental assumptions commonly made in the context of several data mining algorithms.

In this paper, we study the problem of identifying correlations between multiple data streams. In particular, we propose algorithms capable of capturing correlations between multiple continuous data streams in a highly efficient and accurate manner. Our algorithms and techniques are applicable in both synchronous and asynchronous data streaming environments. We capture correlations between multiple streams using the well-known technique of Singular Value Decomposition (SVD). Correlations between data items, and the SVD technique in particular, have been repeatedly utilized in off-line (non-stream) data mining problems, for example forecasting, approximate query answering, and data reduction.

We propose a methodology based on a combination of dimensionality reduction and sampling to make the SVD technique suitable for a data stream context. Our techniques are approximate, trading accuracy for performance, and we analytically quantify this tradeoff. We present a thorough experimental evaluation, using both real and synthetic data sets, from a prototype implementation of our technique, investigating the impact of various parameters on the accuracy of the overall computation. Our results indicate that correlations between multiple data streams can be identified very efficiently and accurately. The algorithms proposed herein are presented as generic tools, with a multitude of applications to data stream mining problems.
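
The general idea of using SVD to expose correlations across streams, and of trading accuracy for speed through sampling, can be pictured with a small sketch. This is not the algorithm proposed in the paper, whose details the abstract does not give. It assumes a synchronous window stored as a matrix with one row per time step and one column per stream, and it swaps in a generic norm-weighted row sampling scheme as a stand-in for the paper's sampling step. The function names `window_svd` and `sampled_svd`, the synthetic data, and all parameter values are hypothetical.

```python
import numpy as np


def window_svd(window):
    """Exact SVD of a (time steps x streams) window.

    Returns the singular values and the right singular vectors
    (one row per direction in stream space).
    """
    # Center each stream so the decomposition reflects co-variation
    # between streams rather than their individual offsets.
    centered = window - window.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    return s, vt


def sampled_svd(window, num_rows, rng):
    """Approximate the same quantities from a sample of the rows.

    Assumption: rows are drawn with probability proportional to their
    squared norm, a generic scheme used here only for illustration.
    """
    centered = window - window.mean(axis=0)
    sq_norms = np.sum(centered ** 2, axis=1)
    probs = sq_norms / sq_norms.sum()
    idx = rng.choice(len(centered), size=num_rows, replace=True, p=probs)
    # Rescale each sampled row so that sample.T @ sample is an unbiased
    # estimate of centered.T @ centered.
    sample = centered[idx] / np.sqrt(num_rows * probs[idx])[:, None]
    _, s, vt = np.linalg.svd(sample, full_matrices=False)
    return s, vt


# Synthetic window: four streams driven by one hidden signal plus noise.
rng = np.random.default_rng(0)
num_steps = 2000
hidden = rng.standard_normal(num_steps)
weights = np.array([1.0, 0.8, -0.6, 0.4])
window = np.outer(hidden, weights) + 0.05 * rng.standard_normal((num_steps, 4))

s_exact, vt_exact = window_svd(window)
s_approx, vt_approx = sampled_svd(window, num_rows=200, rng=rng)

# Share of total variation captured by the first direction (exact SVD).
energy_top = s_exact[0] ** 2 / np.sum(s_exact ** 2)
# Agreement between exact and sampled top directions (1.0 means identical
# up to sign).
alignment = abs(vt_exact[0] @ vt_approx[0])

print("energy in top direction:", energy_top)
print("alignment of sampled vs exact direction:", alignment)
```

In this toy setting a single direction carries almost all of the variation, which is what strongly correlated streams look like under SVD, and the sampled estimate uses a tenth of the rows. A streaming implementation would also need to maintain the window and the per-stream means incrementally, which this sketch does not attempt.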