Compressed counting

  • Authors: Ping Li
  • Affiliations: Cornell University, Ithaca NY
  • Venue: SODA '09: Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms
  • Year: 2009

Abstract

We propose Compressed Counting (CC) for approximating the αth frequency moments (0 < α ≤ 2) of data streams under a relaxed strict-Turnstile model, using maximally-skewed stable random projections. Estimators based on the geometric mean and the harmonic mean are developed. When α = 1, a simple counter suffices for counting the first moment (i.e., sum). The geometric mean estimator of CC has asymptotic variance ∝ Δ = |α − 1|, capturing the intuition that the complexity should decrease as Δ = |α − 1| → 0. However, the previous classical algorithms based on symmetric stable random projections [12, 15] required O(1/ε²) space in order to approximate the αth moments within a 1 + ε factor, for any 0 < α ≤ 2, including α = 1. We show that using the geometric mean estimator, CC requires O [EQUATION] space, as Δ → 0. Therefore, in the neighborhood of α = 1, the complexity of CC is essentially O(1/ε) instead of O(1/ε²). CC may be useful for estimating Shannon entropy, which can be approximated by certain functions of the αth moments with α → 1. [10, 9] suggested using α = 1 + Δ with (e.g.,) Δ < 0.0001 and ε < 10⁻⁷, to rigorously ensure reasonable approximations. Thus, unfortunately, CC is "theoretically impractical" for estimating Shannon entropy, despite its empirical success reported in [16].
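
The link between the αth moments and Shannon entropy mentioned in the abstract can be illustrated with a minimal sketch. It is not the CC algorithm: the moments here are computed exactly from a small count vector rather than estimated from a stream. It also assumes the Tsallis entropy, (1 − F_α / F_1^α) / (α − 1), as one example of a "function of the αth moments"; the abstract does not say which function is meant. The function names and the sample counts are hypothetical.

```python
import math


def frequency_moment(counts, alpha):
    """Exact alpha-th frequency moment F_alpha = sum_i counts[i]**alpha."""
    return sum(c ** alpha for c in counts if c > 0)


def shannon_entropy(counts):
    """Shannon entropy (in nats) of the empirical distribution of counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def tsallis_entropy(counts, alpha):
    """Tsallis entropy written purely in terms of F_1 and F_alpha.

    Assumption: this is one of several moment-based functions that
    converge to the Shannon entropy as alpha -> 1.
    """
    f1 = frequency_moment(counts, 1.0)
    f_alpha = frequency_moment(counts, alpha)
    return (1.0 - f_alpha / f1 ** alpha) / (alpha - 1.0)


# Hypothetical item counts, chosen only for illustration.
counts = [50, 30, 15, 5]
exact = shannon_entropy(counts)

for delta in (0.1, 0.01, 0.0001):
    approx = tsallis_entropy(counts, 1.0 + delta)
    print(f"delta={delta:g}  tsallis={approx:.6f}  shannon={exact:.6f}")
```

In this sketch the gap between the two entropies shrinks roughly in proportion to Δ. The numerator 1 − F_α / F_1^α is itself of order Δ, so a relative error ε in an estimated F_α would shift the result by roughly ε/Δ. That is consistent with the very small ε the abstract cites for rigorous guarantees.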