An Approximate L1-Difference Algorithm for Massive Data Streams

  • Authors:
  • J. Feigenbaum; S. Kannan; M. Strauss; M. Viswanathan

  • Venue:
  • FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science
  • Year:
  • 1999

Abstract

We give a space-efficient, one-pass algorithm for approximating the L1 difference $\sum_i |a_i - b_i|$ between two functions, when the function values $a_i$ and $b_i$ are given as data streams, and their order is chosen by an adversary. Our main technical innovation is a method of constructing families $\{V_j\}$ of limited-independence random variables that are range-summable, by which we mean that the sum $\sum_{j=0}^{c-1} V_j(s)$ is computable in time polylog(c), for all seeds s. These random-variable families may be of interest outside our current application domain, i.e., massive data streams generated by communication networks. Our L1-difference algorithm can be viewed as a "sketching" algorithm, in the sense of [Broder, Charikar, Frieze, and Mitzenmacher, STOC '98, pp. 327-336], and our algorithm performs better than that of Broder et al. when used to approximate the symmetric difference of two sets with small symmetric difference.
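
To make the role of the random variables concrete, here is a minimal Python sketch of a sign-based L1-difference estimator. It is an illustration under stated assumptions, not the construction from the paper: the function names (`sign`, `range_sum`, `sketch`, `estimate_l1`) are hypothetical, the +1/-1 values come from a SHA-256 hash rather than a limited-independence family, the range sum is computed naively in O(c) time instead of polylog(c), and each index i is assumed to appear at most once per stream. The idea it shows is that each stream keeps one counter per seed, and the squared difference of matching counters has expectation equal to the L1 difference when the signs are uncorrelated.

```python
import hashlib
import statistics


def sign(seed: int, i: int, j: int) -> int:
    """Hypothetical stand-in for a +1/-1 variable indexed by (i, j) under a seed."""
    digest = hashlib.sha256(f"{seed}:{i}:{j}".encode()).digest()
    return 1 if digest[0] & 1 else -1


def range_sum(seed: int, i: int, c: int) -> int:
    """Naive O(c) sum of the first c variables for index i.

    A range-summable family would compute this in time polylog(c);
    this loop is only a placeholder for that step.
    """
    return sum(sign(seed, i, j) for j in range(c))


def sketch(stream, seeds):
    """One pass over (index, value) pairs in any order; one counter per seed."""
    counters = [0] * len(seeds)
    for i, value in stream:
        for k, seed in enumerate(seeds):
            counters[k] += range_sum(seed, i, value)
    return counters


def estimate_l1(sketch_a, sketch_b, group_size):
    """Median of group means of the squared counter differences."""
    squares = [(x - y) ** 2 for x, y in zip(sketch_a, sketch_b)]
    means = [
        statistics.mean(squares[g:g + group_size])
        for g in range(0, len(squares), group_size)
    ]
    return statistics.median(means)


if __name__ == "__main__":
    seeds = list(range(60))
    stream_a = [(0, 7), (1, 2), (2, 9)]
    stream_b = [(2, 4), (0, 7), (1, 5)]  # different arrival order
    estimate = estimate_l1(sketch(stream_a, seeds), sketch(stream_b, seeds), 20)
    print("estimated L1 difference:", estimate)
```

Because the counters are plain sums, the sketch of a stream does not depend on the order in which its items arrive, and the two sketches can be computed separately and compared afterwards. The number of seeds and the group size here are arbitrary choices for illustration; they control the accuracy and confidence of the estimate.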