Recently, much work has focused on devising one-pass algorithms for processing and querying multiple data streams in applications such as network monitoring and sensor networks. Estimating the cardinality of set expressions over streams is perhaps one of the most fundamental of these problems. Unfortunately, no solution has yet been devised for this problem over sliding windows. In this paper, we propose a space-efficient algorithmic solution for estimating the cardinality of set expressions over sliding windows. Our probabilistic method is based on a new hash-based synopsis, termed the improved 2-level hash sketch. A thorough experimental evaluation demonstrates that our method solves the problem efficiently.
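
To make the target quantity concrete, the following is a minimal sketch of an exact baseline, not the improved 2-level hash sketch proposed in the paper. It assumes count-based windows (the last N items of each stream) and an illustrative expression |(A ∪ B) − C|; the names `WindowedStream` and `exact_cardinality` are hypothetical. The baseline stores every item in the window, which is the space cost a synopsis-based estimator is meant to avoid.

```python
from collections import deque


class WindowedStream:
    """Holds the last `window` items of one stream (count-based window, an assumption)."""

    def __init__(self, window):
        # A bounded deque drops the oldest item automatically when full.
        self.items = deque(maxlen=window)

    def add(self, item):
        self.items.append(item)

    def distinct(self):
        # Distinct items currently inside the window.
        return set(self.items)


def exact_cardinality(a, b, c):
    """Exact |(A union B) minus C| over the current windows of three streams."""
    return len((a.distinct() | b.distinct()) - c.distinct())


a, b, c = WindowedStream(3), WindowedStream(3), WindowedStream(3)
for x in [1, 2, 3, 4]:
    a.add(x)  # item 1 expires; window holds 2, 3, 4
for x in [4, 5, 5]:
    b.add(x)  # distinct items: 4, 5
for x in [9, 3, 3]:
    c.add(x)  # distinct items: 9, 3

print(exact_cardinality(a, b, c))  # (A union B) = {2, 3, 4, 5}; removing C leaves {2, 4, 5}
```

Because item 1 has already slid out of the window of stream A, it no longer contributes to the result; handling such expirations in small space is what makes the sliding-window version of the problem harder than the unbounded-stream version.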