Approximation algorithms for wavelet transform coding of data streams

  • Authors:
  • Sudipto Guha;Boulos Harb

  • Affiliations:
  • University of Pennsylvania, Philadelphia, PA;University of Pennsylvania, Philadelphia, PA

  • Venue:
  • SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
  • Year:
  • 2006

Abstract

Given a set of orthonormal basis functions {ψ_i} and a target function/vector f, the wavelet representation problem is to construct an approximation f̂ as a combination of at most B basis vectors so as to minimize some normed distance between f and f̂. The problem is well understood if the error is the mean squared error: the B coefficients of the wavelet expansion that are largest in absolute value should be retained. This strategy follows from the proof of optimality and is not a built-in constraint.

The mean squared error, however, is not the optimization criterion in several scenarios. The above easy solution to the wavelet representation problem does not carry over to ℓ_p for p ≠ 2, and it turns out that restricting the solution to any subset of coefficients of size B or less is suboptimal compared to the best solution, which can choose arbitrary real numbers. Further, all the previous literature on non-ℓ_2 errors only considered the Haar system.

In this paper we provide the first approximation schemes for the unrestricted optimization problem. We provide a lower bounding technique based on a system of inequalities. We show that a modified greedy algorithm that retains the coefficients of expansion gives an O(log n) true (factor) approximation algorithm for a wide variety of compact wavelet systems, including Haar, Daubechies, Symmlets, and Coiflets, among others. This vindicates several scaling type algorithms which are used in practice. We subsequently augment the lower bound and give an FPTAS for the Haar system. The same ideas extend to a QPTAS for the more general class of compact wavelets mentioned above. We also consider adaptive quantization problems, which are generalizations of the B-term representations.
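As an illustration of the mean-squared-error case described in the abstract, the following is a minimal Python sketch of the classical rule: transform, keep the B coefficients of largest magnitude, and invert. It is not the paper's approximation algorithm for non-ℓ_2 errors. The function names (`haar_forward`, `haar_inverse`, `keep_largest`, `l2_error_sq`) are hypothetical, and the sketch assumes the orthonormal Haar basis and an input whose length is a power of two.

```python
import math


def haar_forward(x):
    """Orthonormal Haar transform of a sequence whose length is a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    coeffs = [float(v) for v in x]
    s = math.sqrt(2.0)
    length = n
    while length > 1:
        half = length // 2
        out = [0.0] * length
        for i in range(half):
            a, b = coeffs[2 * i], coeffs[2 * i + 1]
            out[i] = (a + b) / s          # scaled average
            out[half + i] = (a - b) / s   # scaled difference (detail)
        coeffs[:length] = out
        length = half
    return coeffs


def haar_inverse(c):
    """Inverse of haar_forward."""
    n = len(c)
    x = [float(v) for v in c]
    s = math.sqrt(2.0)
    length = 1
    while length < n:
        out = [0.0] * (2 * length)
        for i in range(length):
            avg, diff = x[i], x[length + i]
            out[2 * i] = (avg + diff) / s
            out[2 * i + 1] = (avg - diff) / s
        x[:2 * length] = out
        length *= 2
    return x


def keep_largest(coeffs, B):
    """Zero all but the B coefficients of largest absolute value."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:B])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]


def l2_error_sq(f, g):
    """Sum of squared differences between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(f, g))
```

Because the basis is orthonormal, the squared ℓ_2 error of `haar_inverse(keep_largest(haar_forward(f), B))` equals the total energy of the dropped coefficients, which is why keeping the largest ones is optimal for this error measure. The same argument does not apply to other ℓ_p norms, which is the setting the paper addresses.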