Summary cache: a scalable wide-area web cache sharing protocol
IEEE/ACM Transactions on Networking (TON)
Space/time trade-offs in hash coding with allowable errors
Communications of the ACM
Models and issues in data stream systems
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Database System Implementation
Database System Implementation
Maintaining Stream Statistics over Sliding Windows
SIAM Journal on Computing
Maintaining time-decaying stream aggregates
Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Issues in data stream management
ACM SIGMOD Record
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Approximate counts and quantiles over sliding windows
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Duplicate detection in click streams
WWW '05 Proceedings of the 14th international conference on World Wide Web
Time-Decaying Bloom Filters for Data Streams with Skewed Distributions
RIDE '05 Proceedings of the 15th International Workshop on Research Issues in Data Engineering: Stream Data Mining and Applications
Approximately detecting duplicates for streaming data using stable bloom filters
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Data streams: algorithms and applications
Foundations and Trends® in Theoretical Computer Science
Time-decaying sketches for sensor data aggregation
Proceedings of the twenty-sixth annual ACM symposium on Principles of distributed computing
Duplicate detection in pay-per-click streams using temporal stateful Bloom filters
International Journal of Data Analysis Techniques and Strategies
Hi-index | 0.00 |
The duplicate-insensitive and time-decayed sum of an arbitrary subset in a stream is an important aggregation for various analyses in many distributed stream scenarios. In general, precisely providing this sum in an unbounded and high-rate stream is infeasible. Therefore, we target at this problem and introduce a sketch, namely, time-decaying Bloom Filter (TDBF). The TDBF can detect duplicates in a stream and meanwhile dynamically maintain decayed-weight of all distinct elements in the stream according to a user-specified decay function. For a query for the current decayed sum of a subset in the stream, TDBF provides an effective estimation. In our theoretical analysis, a provably approximate guarantee has been given for the error of the estimation. In addition, the experimental results on synthetic stream validate our theoretical analysis.