Approximating frequent items in asynchronous data stream over a sliding window

  • Authors:
  • Ho-Leung Chan; Tak-Wah Lam; Lap-Kei Lee; Hing-Fung Ting

  • Affiliations:
  • Department of Computer Science, University of Hong Kong (all authors)

  • Venue:
  • WAOA'09 Proceedings of the 7th international conference on Approximation and Online Algorithms
  • Year:
  • 2009

Abstract

In an asynchronous data stream, data items may arrive out of order with respect to their original timestamps. This paper gives a space-efficient data structure for maintaining such a stream so that the set of frequent items over a sliding time window can be approximated with sufficient accuracy. Prior to our work, the best solution was due to Cormode et al. [3], with space complexity $O(\frac{1}{\varepsilon} \log W \log (\frac{\varepsilon B}{\log W}) \min\{\log W, \frac{1}{\varepsilon}\}\log U)$, where ε is the given error bound, W and B are parameters of the sliding window, and U is the number of all possible item names. Our solution reduces the space to $O(\frac{1}{\varepsilon} \log W \log (\frac{\varepsilon B}{\log W}))$. We also unify the study of synchronous and asynchronous data streams by quantifying the delay of the data items. When the delay is zero, our solution matches the space complexity of the best solution for synchronous data streams [8].
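
To make the size of the improvement concrete, the two bounds stated in the abstract can be divided term by term. The sketch below only rearranges those two expressions; the symbols $S_{\text{prev}}$ and $S_{\text{new}}$ are labels introduced here for the earlier bound [3] and the new bound, and the numeric values are hypothetical, chosen purely for illustration, with logarithms taken base 2 as an assumption.

$$
\frac{S_{\text{prev}}}{S_{\text{new}}}
= \frac{\frac{1}{\varepsilon} \log W \log\left(\frac{\varepsilon B}{\log W}\right) \min\{\log W, \frac{1}{\varepsilon}\} \log U}
       {\frac{1}{\varepsilon} \log W \log\left(\frac{\varepsilon B}{\log W}\right)}
= \min\left\{\log W, \frac{1}{\varepsilon}\right\} \log U
$$

For instance, with the hypothetical values $W = 2^{20}$, $\varepsilon = 0.01$ and $U = 2^{32}$, we get $\min\{20, 100\} \cdot 32 = 640$. Since both bounds are asymptotic, this figure ignores the constants hidden by the $O(\cdot)$ notation and indicates only how the saving scales with $W$, $\varepsilon$ and $U$.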