Sketching information divergences

  • Authors: Sudipto Guha; Piotr Indyk; Andrew McGregor
  • Affiliations: University of Pennsylvania; Massachusetts Institute of Technology; University of California, San Diego
  • Venue: COLT'07 Proceedings of the 20th annual conference on Learning theory
  • Year: 2007

Abstract

When comparing discrete probability distributions, the natural measures of similarity are not ℓ_p distances but information divergences such as the Kullback-Leibler divergence and the Hellinger distance. This paper considers some of the issues involved in constructing small-space sketches of distributions, a concept related to dimensionality reduction, such that these measures can be approximately computed from the sketches. Related problems for ℓ_p distances are reasonably well understood through a series of results including those of Johnson and Lindenstrauss [27,18], Alon, Matias, and Szegedy [1], Indyk [24], and Brinkman and Charikar [8]. In contrast, almost no analogous results are known to date for sketching the information divergences used in statistics and learning theory.
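
For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch of how they are computed exactly (without any sketching) for two discrete distributions given as equal-length lists of probabilities. The function names are hypothetical, and the formulas are the common textbook forms: Kullback-Leibler divergence in nats, and squared Hellinger distance with a 1/2 normalization. This is an assumption; the paper itself may use a different normalization.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats (textbook form)."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # the term 0 * log(0 / qi) is taken to be 0
        if qi == 0.0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


def hellinger_squared(p, q):
    """Squared Hellinger distance, assuming the 1/2 normalization."""
    return 0.5 * sum(
        (math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q)
    )


def lp_distance(p, q, exponent):
    """The l_p distance between p and q for a finite exponent >= 1."""
    return sum(abs(pi - qi) ** exponent for pi, qi in zip(p, q)) ** (
        1.0 / exponent
    )


if __name__ == "__main__":
    p = [0.5, 0.5]
    q = [0.9, 0.1]
    print("KL(p || q)  =", kl_divergence(p, q))
    print("KL(q || p)  =", kl_divergence(q, p))  # differs: KL is asymmetric
    print("Hellinger^2 =", hellinger_squared(p, q))
    print("l_1         =", lp_distance(p, q, 1))
    print("l_2         =", lp_distance(p, q, 2))
```

Unlike the ℓ_p distances, the Kullback-Leibler divergence is not symmetric and can be infinite when one distribution assigns zero probability to an outcome that the other does not, which illustrates why the two families behave differently.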