We consider the problem of estimating the frequency count of data stream elements under polynomial decay functions. In this setting, every element in the stream is assigned a time-decreasing weight given by a non-increasing polynomial function. Decay functions are used in applications where older data is less significant, less interesting, or even less reliable than recent data. Consider a data stream of $N$ elements drawn from a universe $U$. We propose three poly-logarithmic algorithms for the problem. The first, deterministic, uses $O(\frac{1}{\epsilon^2} \log N (\log\log N + \log U))$ bits, where $\epsilon \in (0,1)$ is the approximation parameter. The second, probabilistic, uses $O(\frac{1}{\epsilon^2} \log\frac{N}{\delta} \log\frac{1}{\epsilon})$ bits or $O(\frac{1}{\epsilon^2} \log\frac{N}{\delta} \log N)$ bits, depending on the decay function parameter, where $\delta \in (0,1)$ is the probability of failure. The third, deterministic in the stochastic model, uses $O(\frac{1}{\epsilon} \log U)$ bits or $O(\frac{1}{\epsilon^2} \log N)$ bits, also depending on the decay parameter, as described in the paper. This variant of the problem is important and has many applications. To our knowledge, it has not been studied before.
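To make the problem statement concrete, the following is a minimal sketch of the quantity being estimated: the exact polynomially decayed frequency of each element, computed naively from a fully stored stream. It is not one of the three algorithms in the paper. The decay function $1/(\text{age}+1)^{\alpha}$ and the names `poly_decay_weight`, `decayed_counts`, and `alpha` are assumptions chosen for illustration; the abstract does not fix a particular form.

```python
from collections import defaultdict


def poly_decay_weight(age: int, alpha: float) -> float:
    """Hypothetical polynomial decay: weight of an element that is `age` steps old.

    The form 1 / (age + 1) ** alpha is an assumption for illustration only.
    """
    return 1.0 / (age + 1) ** alpha


def decayed_counts(stream, alpha):
    """Exact decayed frequency of every element at the end of `stream`.

    Naive baseline: it keeps the whole stream and uses linear time,
    which is exactly what small-space streaming algorithms try to avoid.
    """
    n = len(stream)
    counts = defaultdict(float)
    for t, item in enumerate(stream):
        age = n - 1 - t  # the most recent element has age 0
        counts[item] += poly_decay_weight(age, alpha)
    return dict(counts)
```

With `alpha = 0` every weight equals 1, so the result reduces to ordinary frequency counts. For `alpha > 0`, the relative weights of two stored elements change as time advances, so a single counter cannot simply be rescaled at each step as it can under exponential decay. This is one reason the polynomial case calls for more elaborate summaries.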