Lower bounds for quantile estimation in random-order and multi-pass streaming

Authors:
Sudipto Guha;Andrew McGregor
Affiliations:
University of Pennsylvania;University of California, San Diego
Venue:
ICALP'07 Proceedings of the 34th international conference on Automata, Languages and Programming
Year:
2007

Citing 20
Cited 13

Rounds in communication complexity revisited

SIAM Journal on Computing
Communication complexity

Communication complexity
Approximate medians and other quantiles in one pass and with limited memory

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
On data structures and asymmetric communication complexity

Journal of Computer and System Sciences
Random sampling techniques for space efficient online computation of order statistics of large datasets

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Space-efficient online computation of quantile summaries

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Pass efficient algorithms for approximating large matrices

SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
Counting inversions in lists

SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
Frequency Estimation of Internet Packet Streams with Limited Space

ESA '02 Proceedings of the 10th Annual European Symposium on Algorithms
Medians and beyond: new aggregation techniques for sensor networks

SenSys '04 Proceedings of the 2nd international conference on Embedded networked sensor systems
Multi-pass geometric algorithms

SCG '05 Proceedings of the twenty-first annual symposium on Computational geometry
An improved data stream summary: the count-min sketch and its applications

Journal of Algorithms
Streaming and sublinear approximation of entropy and information distances

SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithms
The space complexity of pass-efficient algorithms for clustering

SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithms
On graph problems in a semi-streaming model

Theoretical Computer Science - Automata, languages and programming: Algorithms and complexity (ICALP-A 2004)
Space- and time-efficient deterministic algorithms for biased quantiles over data streams

Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Approximate quantiles and the order of the stream

Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
How to summarize the universe: dynamic maintenance of quantiles

VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
On determinism versus non-determinism and related problems

SFCS '83 Proceedings of the 24th Annual Symposium on Foundations of Computer Science
Finding graph matchings in data streams

APPROX'05/RANDOM'05 Proceedings of the 8th international workshop on Approximation, Randomization and Combinatorial Optimization Problems, and Proceedings of the 9th international conference on Randomization and Computation: algorithms and techniques

Tight lower bounds for selection in randomly ordered streams

Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
Declaring independence via the sketching of sketches

Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
Robust lower bounds for communication and stream computation

STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing
Estimating PageRank on graph streams

Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Sleeping on the job: energy-efficient and robust broadcast for radio networks

Proceedings of the twenty-seventh ACM symposium on Principles of distributed computing
Comparison-based time-space lower bounds for selection

SODA '09 Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms
The average-case complexity of counting distinct elements

Proceedings of the 12th International Conference on Database Theory
Best-Order Streaming Model

TAMC '09 Proceedings of the 6th Annual Conference on Theory and Applications of Models of Computation
Sublinear estimation of entropy and information distances

ACM Transactions on Algorithms (TALG)
Comparison-based time-space lower bounds for selection

ACM Transactions on Algorithms (TALG)
Best-order streaming model

Theoretical Computer Science
Estimating PageRank on graph streams

Journal of the ACM (JACM)
Optimal Collapsing Protocol for Multiparty Pointer Jumping

Theory of Computing Systems

Abstract

We present lower bounds on the space required to estimate the quantiles of a stream of numerical values. Quantile estimation is perhaps the most studied problem in the data stream model, and it is relatively well understood in the basic single-pass data stream model in which the values are ordered adversarially. Natural extensions of this basic model include the random-order model, in which the values are ordered randomly (e.g. [21,5,13,11,12]), and the multi-pass model, in which an algorithm is permitted a limited number of passes over the stream (e.g. [6,7,1,19,2]). We present lower bounds that complement existing upper bounds [21,11] in both models. One consequence is an exponential separation between the random-order and adversarial-order models: using O(polylog n) space, exact selection requires Ω(log n / log log n) passes in the adversarial-order model while O(log log n) passes are sufficient in the random-order model.