Introduction to algorithms
The space complexity of approximating the frequency moments
Journal of Computer and System Sciences
On the Average Number of Maxima in a Set of Vectors and Applications
Journal of the ACM (JACM)
A unifying look at data structures
Communications of the ACM
Estimating simple functions on the union of data streams
Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Models and issues in data stream systems
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Distributed streams algorithms for sliding windows
Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures
Maintaining Stream Statistics over Sliding Windows
SIAM Journal on Computing
Finding Interesting Associations without Support Pruning
IEEE Transactions on Knowledge and Data Engineering
Data streams: algorithms and applications
SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
Counting Distinct Elements in a Data Stream
RANDOM '02 Proceedings of the 6th International Workshop on Randomization and Approximation Techniques
Estimating Rarity and Similarity over Data Stream Windows
ESA '02 Proceedings of the 10th Annual European Symposium on Algorithms
Maintaining time-decaying stream aggregates
Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Maintaining variance and k-medians over data stream windows
Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Continuously Maintaining Quantile Summaries of the Most Recent N Elements over a Data Stream
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Hourly analysis of a very large topically categorized web query log
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Effective Computation of Biased Quantiles over Data Streams
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Stabbing the Sky: Efficient Skyline Computation over Sliding Windows
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Approximate counts and quantiles over sliding windows
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Discovering evolutionary theme patterns from text: an exploration of temporal text mining
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Space- and time-efficient deterministic algorithms for biased quantiles over data streams
Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
A framework for clustering evolving data streams
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Resource sharing in continuous sliding-window aggregates
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
CUDA-Enabled Optimisation of Technical Analysis Parameters
DS-RT '12 Proceedings of the 2012 IEEE/ACM 16th International Symposium on Distributed Simulation and Real Time Applications
Hi-index | 0.00 |
In this article, we propose a new multiscale sliding window model which differentiates data items in different time periods of the data stream, based on a reasonable monotonicity of resolution assumption. Our model, as a well-motivated extension of the sliding window model, stands halfway between the traditional all-history and time-decaying models. We also present algorithms for estimating two significant data stream statistics---F0 and Jacard's similarity coefficient---with reasonable accuracies under the new model. Our algorithms use space logarithmic in the data stream size and linear in the number of windows; they support update time logarithmic in the number of windows and independent of the accuracy required. Our algorithms are easy to implement. Experimental results demonstrate the efficiencies of our algorithms. Our techniques apply to scenarios in which universe sampling is used.