A Fast Algorithm for Approximate Quantiles in High Speed Data Streams

  • Authors:
  • Qi Zhang;Wei Wang

  • Affiliations:
  • University of North Carolina, Chapel Hill, USA;University of North Carolina, Chapel Hill, USA

  • Venue:
  • SSDBM '07 Proceedings of the 19th International Conference on Scientific and Statistical Database Management
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present a fast algorithm for computing approx- imate quantiles in high speed data streams with deter- ministic error bounds. For data streams of size N where N is unknown in advance, our algorithm par- titions the stream into sub-streams of exponentially increasing size as they arrive. For each sub-stream which has a fixed size, we compute and maintain a multi-level summary structure using a novel algorithm. In order to achieve high speed performance, the algo- rithm uses simple block-wise merge and sample oper- ations. Overall, our algorithms for fixed-size streams and arbitrary-size streams have a computational cost of O(N log( \frac{1} { \in } log \in N)) and an average per-element update cost of O(log logN) if \in is fixed.