Tight Lower Bounds for the Distinct Elements Problem

  • Authors:
  • Piotr Indyk; David Woodruff

  • Venue:
  • FOCS '03 Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science
  • Year:
  • 2003

Abstract

We prove strong lower bounds for the space complexity of $(\varepsilon, \delta)$-approximating the number of distinct elements $F_0$ in a data stream. Let $m$ be the size of the universe from which the stream elements are drawn. We show that any one-pass streaming algorithm for $(\varepsilon, \delta)$-approximating $F_0$ must use $\Omega(1/\varepsilon^2)$ space when $\varepsilon = \Omega(m^{-1/(9+k)})$, for any $k > 0$, improving upon the known lower bound of $\Omega(1/\varepsilon)$ for this range of $\varepsilon$. This lower bound is tight up to a factor of $\log \log m$ for small $\varepsilon$ and $\log(1/\varepsilon)$ for large $\varepsilon$. Our lower bound is derived from a reduction from the one-way communication complexity of approximating a boolean function in Euclidean space. The reduction makes use of a low-distortion embedding from an $\ell_2$ norm to an $\ell_1$ norm.
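
To make the terminology concrete, here is a minimal Python sketch. It assumes the standard reading of an (epsilon, delta)-approximation, which the abstract does not spell out: the estimate lies within a relative error epsilon of F0 with probability at least 1 - delta. The function names and the sample stream are illustrative and do not come from the paper. The exact counter is only a baseline that stores every distinct element; it is not a small-space streaming algorithm, and the two "bound" values only show the orders of growth being compared, with constants ignored.

```python
def distinct_count(stream):
    """Exact F0: the number of distinct elements in the stream.

    Baseline only: it stores every distinct element, so its space
    grows linearly with F0.
    """
    return len(set(stream))


def within_relative_error(estimate, truth, eps):
    """Return True if |estimate - truth| <= eps * truth.

    Assumed (standard) accuracy condition; an (eps, delta)-approximation
    must satisfy it with probability at least 1 - delta.
    """
    return abs(estimate - truth) <= eps * truth


# Hypothetical sample stream; its distinct values are 1, 2, 3, 4, 5, 6, 9.
stream = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
f0 = distinct_count(stream)

eps = 0.25
order_of_earlier_bound = 1 / eps    # order of the earlier Omega(1/eps) bound
order_of_new_bound = 1 / eps ** 2   # order of the Omega(1/eps^2) bound
```

With `eps = 0.25`, an estimate of 8 for this stream satisfies the accuracy condition, while an estimate of 9 does not. The gap between `1 / eps` and `1 / eps ** 2` widens quickly as `eps` shrinks, which is why the improvement matters for high-accuracy estimation.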