An information statistics approach to data stream and communication complexity

  • Authors:
  • Ziv Bar-Yossef;T. S. Jayram;Ravi Kumar;D. Sivakumar

  • Affiliations:
  • IBM Almaden Research Center, 650 Harry Road, San Jose, CA;IBM Almaden Research Center, 650 Harry Road, San Jose, CA;IBM Almaden Research Center, 650 Harry Road, San Jose, CA;IBM Almaden Research Center, 650 Harry Road, San Jose, CA

  • Venue:
  • Journal of Computer and System Sciences - Special issue on FOCS 2002
  • Year:
  • 2004

Quantified Score

Hi-index 0.06

Visualization

Abstract

We present a new method for proving strong lower bounds in communication complexity. This method is based on the notion of the conditional information complexity, of a function which is the minimum amount of information about the inputs that has to be revealed by a communication protocol for the function. While conditional information complexity is a lower bound on communication complexity, we show that it also admits a direct sum theorem. Direct sum decomposition reduces our task to that of proving conditional information complexity lower bounds for simple problems (such as the AND of two bits). For the latter, we develop novel techniques based on Hellinger distance and its generalizations.Our paradigm leads to two main results: (1) An improved lower bound for the multi-party set-disjointness problem in the general communication complexity model, and a nearly optimal lower bound in the one-way communication model. As a consequence, we show that for any real k 2, approximating the kth frequency moment in the data stream model requires essentially Ω(n1-2/k) space; this resolves a conjecture of Alon et al. (J. Comput. System Sci. 58(1) (1999) 137).(2) A lower bound for the Lp approximation problem in the general communication model: this solves an open problem of Saks and Sun (in: Proceedings of the 34th Annual ACM Symposium on Theory of Computing (STOC), 2002, pp. 360-369). As a consequence, we show that for p 2, approximating the Lp norm to within a factor of nε in the data stream model with constant number of passes requires Ω(n1-4ε-2/p) space.