An efficient algorithm for approximate biased quantile computation in data streams

  • Authors: Qi Zhang; Wei Wang
  • Affiliations: University of North Carolina at Chapel Hill, Chapel Hill, NC (both authors)
  • Venue: Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
  • Year: 2007

Abstract

We propose an efficient algorithm for approximate biased quantile computation in large data streams. Our algorithm computes decomposable biased quantile summaries on fixed-size blocks and dynamically maintains the biased quantile summary for the entire stream as an exponential histogram over the block-wise quantile summaries. The algorithm is computationally efficient, achieving an amortized computational cost of O(log((1/ε) log(εn))) and a space requirement of O(log^3(εn)/ε), where n is the stream length and ε is the approximation error. Our algorithm does not assume prior knowledge of the stream size or of the range of data values in the stream. In practice, our algorithm efficiently maintains summaries over large data streams with tens of millions of observations and achieves significant performance improvement over prior algorithms.
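
The block-wise structure described in the abstract can be illustrated with a small sketch. The code below is not the authors' algorithm: it assumes uniform pruning (keep every second entry) rather than a biased summary, and it makes no claim to the stated cost or space bounds. The class name `BlockwiseSummary`, the `block_size` parameter, and the merge rule are all hypothetical choices made for illustration. It shows only the general idea of summarizing fixed-size blocks and combining equal-level summaries, in the manner of an exponential histogram.

```python
import heapq


class BlockwiseSummary:
    """Toy multi-level quantile summary (illustrative assumption, not the paper's method)."""

    def __init__(self, block_size=64):
        self.block_size = block_size
        self.buffer = []   # current block, not yet full
        self.levels = {}   # level -> sorted list of (value, weight)
        self.n = 0         # number of observations seen so far

    def insert(self, x):
        self.buffer.append(x)
        self.n += 1
        if len(self.buffer) == self.block_size:
            # A full block becomes a level-0 summary with unit weights.
            summary = [(v, 1) for v in sorted(self.buffer)]
            self.buffer = []
            self._carry(summary, 0)

    def _carry(self, summary, level):
        # Like binary addition: two summaries at the same level are merged
        # and pruned into a single summary one level up.
        while level in self.levels:
            other = self.levels.pop(level)
            merged = list(heapq.merge(summary, other))
            # Uniform pruning: keep every second entry and double its weight,
            # so the total weight is preserved.
            summary = [(v, 2 * w) for v, w in merged[1::2]]
            level += 1
        self.levels[level] = summary

    def query(self, phi):
        """Return an approximate phi-quantile, for 0 <= phi <= 1."""
        items = [(v, 1) for v in self.buffer]
        for summary in self.levels.values():
            items.extend(summary)
        if not items:
            raise ValueError("empty summary")
        items.sort()
        target = phi * self.n
        seen = 0
        for v, w in items:
            seen += w
            if seen >= target:
                return v
        return items[-1][0]
```

Because equal-level summaries are merged as soon as two exist, at most one summary is kept per level, so the number of stored summaries grows logarithmically with the number of blocks. A biased-quantile variant would need a non-uniform pruning rule that retains finer resolution near the low (or high) ranks.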