Generalized Shannon Code Minimizes the Maximal Redundancy

Authors:
Michael Drmota; Wojciech Szpankowski
Venue:
LATIN '02 Proceedings of the 5th Latin American Symposium on Theoretical Informatics
Year:
2002

Abstract

Source coding, also known as data compression, is an area of information theory that deals with the design and performance evaluation of optimal codes for data compression. In 1952 Huffman constructed his optimal code, which minimizes the average code length among all prefix codes for known sources. Equivalently, Huffman codes minimize the average redundancy, defined as the difference between the average code length and the entropy of the source. Interestingly enough, no optimal code is known for other popular optimization criteria such as the maximal redundancy, defined as the maximum of the pointwise redundancy over all source sequences. We first prove that a generalized Shannon code minimizes the maximal redundancy among all prefix codes, and present an efficient implementation of the optimal code. Then we compute precisely its redundancy for memoryless sources. Finally, we study universal codes for unknown source distributions. We adopt the minimax approach and search for the best code for the worst source. We establish that such redundancy is a sum of the likelihood estimator and the redundancy of the generalized Shannon code computed for the maximum likelihood distribution. This replaces Shtarkov's bound by an exact formula. We also compute precisely the maximal minimax redundancy for a class of memoryless sources. The main findings of this paper are established by techniques that belong to the toolkit of the "analytic analysis of algorithms", such as the theory of distribution of sequences modulo 1 and Fourier series. These methods have already found applications in other problems of information theory, and they constitute the so-called analytic information theory.
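
To make the maximal-redundancy criterion concrete, here is a minimal Python sketch of one way a generalized Shannon code could be built for a small, explicitly listed alphabet. It is an illustration, not the efficient implementation the abstract refers to. The function names `generalized_shannon_lengths` and `max_redundancy` are hypothetical. The sketch assumes that the pointwise redundancy of a symbol with probability p and code length l is l + log2(p). It starts from the ordinary Shannon lengths, ceil(-log2 p), and shortens symbols by one bit, in increasing order of the fractional part of -log2 p, for as long as the Kraft inequality still holds.

```python
import math
from fractions import Fraction


def generalized_shannon_lengths(probs):
    """Sketch: code lengths that aim to minimize the maximal pointwise redundancy.

    Assumption (not taken from the paper's text): symbols are rounded down
    in increasing order of the fractional part of -log2(p), stopping at the
    first symbol whose shortening would violate the Kraft inequality.
    """
    ideal = [-math.log2(p) for p in probs]
    # Start from the classical Shannon code: round every ideal length up.
    lengths = [math.ceil(x) for x in ideal]
    # Exact Kraft sum, kept as a rational number to avoid rounding errors.
    kraft = sum(Fraction(1, 2 ** l) for l in lengths)
    # Symbols with an integer ideal length already have zero redundancy.
    candidates = [i for i, x in enumerate(ideal) if x != math.floor(x)]
    # Smallest fractional part first: these have the largest redundancy
    # (1 - fractional part) when rounded up.
    candidates.sort(key=lambda i: ideal[i] - math.floor(ideal[i]))
    for i in candidates:
        extra = Fraction(1, 2 ** lengths[i])  # Kraft cost of dropping one bit
        if kraft + extra > 1:
            break
        lengths[i] -= 1
        kraft += extra
    return lengths


def max_redundancy(probs, lengths):
    """Maximum over symbols of (code length + log2 of probability)."""
    return max(l + math.log2(p) for p, l in zip(probs, lengths))


probs = [0.5, 0.3, 0.2]
shannon = [math.ceil(-math.log2(p)) for p in probs]
general = generalized_shannon_lengths(probs)  # [1, 2, 2]
print(shannon, max_redundancy(probs, shannon))
print(general, max_redundancy(probs, general))
```

In this three-symbol example the Shannon lengths are 1, 2, 3, and the symbol with probability 0.2 carries the largest redundancy. Shortening that symbol by one bit uses up the remaining Kraft budget and lowers the maximal redundancy. The sketch relies on floating-point logarithms, so probabilities whose ideal length is extremely close to an integer may need exact arithmetic.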