Maintaining Stream Statistics over Sliding Windows
SIAM Journal on Computing
Frequency Estimation of Internet Packet Streams with Limited Space
ESA '02 Proceedings of the 10th Annual European Symposium on Algorithms
A simple algorithm for finding frequent elements in streams and bags
ACM Transactions on Database Systems (TODS)
Medians and beyond: new aggregation techniques for sensor networks
SenSys '04 Proceedings of the 2nd international conference on Embedded networked sensor systems
Approximate counts and quantiles over sliding windows
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Maintaining significant stream statistics over sliding windows
SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
A simpler and more efficient deterministic scheme for finding frequent items over sliding windows
Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Sketching asynchronous streams over a sliding window
Proceedings of the twenty-fifth annual ACM symposium on Principles of distributed computing
Time-decaying aggregates in out-of-order streams
Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
A deterministic algorithm for summarizing asynchronous streams over a sliding window
STACS'07 Proceedings of the 24th annual conference on Theoretical aspects of computer science
In an asynchronous data stream, data items may arrive out of order with respect to their original timestamps. This paper gives a space-efficient data structure that maintains such a stream so that the set of frequent items over a sliding time window can be approximated with sufficient accuracy. Prior to our work, the best solution was due to Cormode et al. [3], with space complexity $O(\frac{1}{\varepsilon} \log W \log (\frac{\varepsilon B}{\log W}) \min\{\log W, \frac{1}{\varepsilon}\}\log U)$, where ε is the given error bound, W and B are parameters of the sliding window, and U is the number of all possible item names. Our solution reduces the space to $O(\frac{1}{\varepsilon} \log W \log (\frac{\varepsilon B}{\log W}))$. We also unify the study of synchronous and asynchronous data streams by quantifying the delay of the data items. When the delay is zero, our solution matches the space complexity of the best solution for synchronous data streams [8].
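To make the size of the improvement concrete, the two bounds stated above can be compared term by term. This is only a reading of the asymptotic expressions in the abstract, ignoring the constants hidden by the $O(\cdot)$ notation:

$$
\frac{\frac{1}{\varepsilon} \log W \,\log\!\left(\frac{\varepsilon B}{\log W}\right) \min\{\log W, \frac{1}{\varepsilon}\} \log U}{\frac{1}{\varepsilon} \log W \,\log\!\left(\frac{\varepsilon B}{\log W}\right)} \;=\; \min\left\{\log W, \frac{1}{\varepsilon}\right\} \log U .
$$

As an illustration with assumed values that do not come from the paper, taking base-2 logarithms with $W = 2^{20}$, $\varepsilon = 0.01$ and $U = 2^{32}$ gives a factor of $\min\{20, 100\} \cdot 32 = 640$ between the two expressions.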