The relation between the Shannon entropy and the variational distance, two fundamental and frequently used quantities in information theory, is studied in this paper by means of certain bounds on the entropy difference between two probability distributions in terms of the variational distance between them and their alphabet sizes. We also show how to find the distribution that achieves the minimum (or maximum) entropy among all distributions within a given variational distance of any given distribution. These results are applied to solve a number of problems of fundamental interest. For entropy estimation, we obtain an analytic formula for the confidence interval, settling a problem that has been open for more than 30 years. For the approximation of probability distributions, we determine the minimum entropy difference between two distributions in terms of their alphabet sizes and the variational distance between them. In particular, we show that the entropy difference between two distributions that are close in variational distance can be arbitrarily large if the alphabet sizes of the two distributions are unconstrained. For random number generation, we characterize the tradeoff between the amount of randomness required and the distortion, measured in variational distance. New tools for non-convex optimization have been developed to establish these results.
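
As a quick numerical illustration of the unbounded-entropy-difference claim (our sketch, not taken from the paper): take P to be a point mass and let Q move a small probability eps off that atom, spreading it uniformly over n fresh symbols. The variational distance, taken here as V(P, Q) = sum_i |p_i - q_i| consistent with the abstract's usage, stays at 2*eps for every n, while the entropy gap grows like eps * log2(n). The distributions, the value of eps, and the function names below are illustrative choices.

    import numpy as np

    def entropy_bits(p):
        # Shannon entropy in bits; zero-probability symbols contribute nothing.
        p = np.asarray(p, dtype=float)
        p = p[p > 0]
        return float(-np.sum(p * np.log2(p)))

    def variational_distance(p, q):
        # V(P, Q) = sum_i |p_i - q_i|, after padding to a common alphabet.
        n = max(len(p), len(q))
        p = np.pad(np.asarray(p, dtype=float), (0, n - len(p)))
        q = np.pad(np.asarray(q, dtype=float), (0, n - len(q)))
        return float(np.abs(p - q).sum())

    eps = 0.01  # probability mass moved off the point mass (our choice)
    for n in (10, 10**3, 10**6):
        P = [1.0]                        # point mass: H(P) = 0
        Q = [1.0 - eps] + [eps / n] * n  # eps spread over n fresh symbols
        print(f"n = {n:>7}:  V(P,Q) = {variational_distance(P, Q):.3f},  "
              f"|H(P) - H(Q)| = {abs(entropy_bits(P) - entropy_bits(Q)):.3f} bits")

Running this prints V(P, Q) = 0.020 on every line while the entropy gap climbs (roughly 0.11, 0.18, and 0.28 bits for n = 10, 10^3, and 10^6). Since the gap grows without bound as n increases while the variational distance is fixed, no bound on the entropy difference in terms of the variational distance alone is possible; the alphabet size must enter, which is precisely the role it plays in the bounds of this paper.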