How to Achieve Minimax Expected Kullback-Leibler Distance from an Unknown Finite Distribution

  • Authors:
  • Dietrich Braess, Jürgen Forster, Tomas Sauer, Hans-Ulrich Simon

  • Venue:
  • ALT '02 Proceedings of the 13th International Conference on Algorithmic Learning Theory
  • Year:
  • 2002

Abstract

We consider a problem that is related to the "Universal Encoding Problem" from information theory. The basic goal is to find rules that map "partial information" about a distribution X over an m-letter alphabet into a guess X̂ for X such that the Kullback-Leibler divergence between X and X̂ is as small as possible. The cost associated with a rule is the maximal expected Kullback-Leibler divergence between X and X̂. First, we show that the cost associated with the well-known add-one rule equals ln(1 + (m-1)/(n+1)), thereby extending a result of Forster and Warmuth [3,2] to m ≥ 3. Second, we derive an absolute (as opposed to asymptotic) lower bound on the smallest possible cost. Technically, this is done by determining (almost exactly) the Bayes error of the add-one rule with a uniform prior (where the asymptotics for n → ∞ was known before). Third, we point to tools from approximation theory and support the conjecture that there exists a rule whose cost asymptotically matches the theoretical barrier from the lower bound.
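
To make the objects in the abstract concrete, the sketch below illustrates the add-one (Laplace) rule and the Kullback-Leibler divergence under which its cost is measured. It assumes, as in the related Forster–Warmuth setting, that the "partial information" is the vector of letter counts from an n-element i.i.d. sample; the function names and the small Monte-Carlo check are illustrative and not from the paper.

```python
import numpy as np

def add_one_rule(counts):
    """Add-one (Laplace) rule: from observed letter counts over an
    m-letter alphabet with n samples, guess (counts + 1) / (n + m)."""
    counts = np.asarray(counts, dtype=float)
    n, m = counts.sum(), counts.size
    return (counts + 1.0) / (n + m)

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q), with the convention 0*ln 0 = 0."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Monte-Carlo estimate of the expected KL divergence of the add-one rule
# for one fixed example distribution X; the paper's cost is the maximum of
# this expectation over all X, shown to equal ln(1 + (m-1)/(n+1)).
rng = np.random.default_rng(0)
m, n, trials = 3, 20, 5000
x = np.array([0.6, 0.3, 0.1])  # illustrative distribution, not from the paper
expected_kl = np.mean([
    kl_divergence(
        x,
        add_one_rule(np.bincount(rng.choice(m, size=n, p=x), minlength=m)),
    )
    for _ in range(trials)
])
print(f"estimated E[KL]    = {expected_kl:.4f}")
print(f"cost upper bound   = {np.log(1 + (m - 1) / (n + 1)):.4f}")
```

For m = 3 and n = 20 the printed bound is ln(1 + 2/21) ≈ 0.091, and the simulated expectation for any fixed X should stay below it, since the cost is the maximum of this expectation over all distributions X.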