Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy

  • Authors:
  • J. Shore;R. Johnson

  • Affiliations:
  • -;-

  • Venue:
  • IEEE Transactions on Information Theory
  • Year:
  • 2006

Quantified Score

Hi-index 754.84

Visualization

Abstract

Jaynes's principle of maximum entropy and Kullbacks principle of minimum cross-entropy (minimum directed divergence) are shown to be uniquely correct methods for inductive inference when new information is given in the form of expected values. Previous justifications use intuitive arguments and rely on the properties of entropy and cross-entropy as information measures. The approach here assumes that reasonable methods of inductive inference should lead to consistent results when there are different ways of taking the same information into account (for example, in different coordinate system). This requirement is formalized as four consistency axioms. These are stated in terms of an abstract information operator and make no reference to information measures. It is proved that the principle of maximum entropy is correct in the following sense: maximizing any function but entropy will lead to inconsistency unless that function and entropy have identical maxima. In other words given information in the form of constraints on expected values, there is only one (distribution satisfying the constraints that can be chosen by a procedure that satisfies the consistency axioms; this unique distribution can be obtained by maximizing entropy. This result is established both directly and as a special case (uniform priors) of an analogous result for the principle of minimum cross-entropy. Results are obtained both for continuous probability densities and for discrete distributions.