The Data Stream Space Complexity of Cascaded Norms

  • Authors:
  • T. S. Jayram; David P. Woodruff

  • Venue:
  • FOCS '09 Proceedings of the 2009 50th Annual IEEE Symposium on Foundations of Computer Science
  • Year:
  • 2009

Abstract

We consider the problem of estimating cascaded aggregates over a matrix presented as a sequence of updates in a data stream. A cascaded aggregate $P \circ Q$ is defined by evaluating aggregate $Q$ repeatedly over each row of the matrix, and then evaluating aggregate $P$ over the resulting vector of values. This problem was introduced by Cormode and Muthukrishnan, PODS, 2005 [CM]. We analyze the space complexity of estimating cascaded norms on an $n \times d$ matrix to within a small relative error. Let $L_p$ denote the $p$-th norm, where $p$ is a non-negative integer. We abbreviate the cascaded norm $L_k \circ L_p$ by $L_{k,p}$. (1) For any constant $k \geq p \geq 2$, we obtain a 1-pass $\tilde{O}(n^{1-2/k} d^{1-2/p})$-space algorithm for estimating $L_{k,p}$. This is optimal up to polylogarithmic factors and resolves an open question of [CM] regarding the space complexity of $L_{4,2}$. We also obtain 1-pass space-optimal algorithms for estimating $L_{\infty,k}$ and $L_{k,\infty}$. (2) We prove a space lower bound of $\Omega(n^{1-1/k})$ on estimating $L_{k,0}$ and $L_{k,1}$, resolving an open question due to Indyk, IITK Data Streams Workshop (Problem 8), 2006. We also resolve two more questions of [CM] concerning $L_{k,2}$ estimation and block heavy hitter problems. Ganguly, Bansal and Dube (FAW, 2008) claimed an $\tilde{O}(1)$-space algorithm for estimating $L_{k,p}$ for any $k, p \in [0,2]$. Our lower bounds show this claim is incorrect.
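To make the definition of a cascaded norm concrete, the following is a minimal Python sketch of an exact, offline baseline. It is not the algorithm from the paper: it stores every touched entry of the matrix, so its space is linear in the matrix size, which is what the streaming algorithms in the abstract are designed to avoid. The function names, the (row, column, delta) update format, and the convention that the 0-th "norm" counts nonzero entries are assumptions made for illustration only.

```python
from collections import defaultdict


def apply_updates(updates):
    """Materialize the matrix from a stream of (row, col, delta) updates.

    Assumed update model: each update adds delta to entry (row, col).
    """
    matrix = defaultdict(lambda: defaultdict(float))
    for row, col, delta in updates:
        matrix[row][col] += delta
    return matrix


def lp_norm(values, p):
    """Return the p-th norm of a sequence of numbers, for integer p >= 0.

    Assumption: for p == 0 this returns the number of nonzero entries.
    """
    magnitudes = [abs(v) for v in values]
    if p == 0:
        return float(sum(1 for v in magnitudes if v != 0))
    return sum(v ** p for v in magnitudes) ** (1.0 / p)


def cascaded_norm(matrix, k, p):
    """Evaluate L_p over each row, then L_k over the vector of row values."""
    row_values = [lp_norm(row.values(), p) for row in matrix.values()]
    return lp_norm(row_values, k)


if __name__ == "__main__":
    # Stream that builds the 2 x 2 matrix [[3, 4], [6, 8]],
    # including one insertion that is later cancelled.
    stream = [
        (0, 0, 3.0), (0, 1, 4.0),
        (1, 0, 6.0), (1, 1, 10.0), (1, 1, -2.0),
    ]
    m = apply_updates(stream)
    print(cascaded_norm(m, 1, 2))  # row L_2 norms are 5 and 10
```

In this example the row-wise $L_2$ values are 5 and 10, so $L_{1,2}$ is their sum and $L_{2,2}$ is the Euclidean norm of the vector (5, 10).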