Identification entropy

  • Authors:
  • R. Ahlswede

  • Affiliations:
  • Fakultät für Mathematik, Universität Bielefeld, Bielefeld, Germany

  • Venue:
  • General Theory of Information Transfer and Combinatorics
  • Year:
  • 2006

Abstract


Shannon (1948) has shown that a source $({\mathcal{U}},P,U)$ with output $U$ satisfying $\Pr(U=u)=P_u$ can be encoded in a prefix code ${\mathcal{C}}=\{c_u:u\in{\mathcal{U}}\}\subset\{0,1\}^*$ such that for the entropy $H(P)=\sum\limits_{u\in{\mathcal{U}}}-P_u\log P_u$ we have $H(P)\leq\sum\limits_{u\in{\mathcal{U}}}P_u\,||c_u||\leq H(P)+1,$ where $||c_u||$ is the length of $c_u$. We use a prefix code $\mathcal{C}$ for another purpose, namely noiseless identification: every user who wants to know whether a $u\in{\mathcal{U}}$ of his interest is the actual source output or not can consider the RV $C=c_U=(C_1,C_2,\dots,C_{||c_U||})$ and check whether $C$ coincides with $c_u$ in the first, second, etc. letter, stopping when the first different letter occurs or when $C=c_u$. Let $L_{\mathcal{C}}(P,u)$ be the expected number of checkings if code $\mathcal{C}$ is used. Our discovery is an identification entropy, namely the function $H_I(P)=2\left(1-\sum\limits_{u\in{\mathcal{U}}}P_u^2\right).$ We prove that $L_{\mathcal{C}}(P,P)=\sum\limits_{u\in{\mathcal{U}}}P_u\,L_{\mathcal{C}}(P,u)\geq H_I(P)$ and thus also that $L(P)=\min\limits_{\mathcal{C}}\max\limits_{u\in{\mathcal{U}}}L_{\mathcal{C}}(P,u)\geq H_I(P),$ together with related upper bounds, which demonstrate the operational significance of identification entropy in noiseless identification, analogous to the role Shannon entropy plays in noiseless data compression. Other averages such as $\bar L_{\mathcal{C}}(P)=\frac1{|{\mathcal{U}}|}\sum\limits_{u\in{\mathcal{U}}}L_{\mathcal{C}}(P,u)$ are also discussed, in particular for Huffman codes, where classically equivalent Huffman codes may now differ. We also show that prefix codes whose codewords correspond to the leaves of a regular binary tree are universally good for this average.
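The checking procedure and the quantities in the abstract are easy to compute directly. The sketch below (an illustration, not from the paper; the symbol names `checks`, `L`, and `H_I` and the example source are our own choices) counts the letter comparisons a user interested in $u$ performs when the output is $v$, forms the expectation $L_{\mathcal{C}}(P,u)$, and compares the average $L_{\mathcal{C}}(P,P)$ with $H_I(P)$:

```python
def checks(cu: str, cv: str) -> int:
    """Letter comparisons made by a user interested in codeword cu
    when the actual encoded output is cv: stop at the first
    differing letter, or after all of cu if cv == cu."""
    for i, (a, b) in enumerate(zip(cu, cv)):
        if a != b:
            return i + 1  # the mismatching letter was also checked
    return len(cu)

def L(code: dict, P: dict, u) -> float:
    """Expected number of checkings L_C(P, u) under output distribution P."""
    return sum(P[v] * checks(code[u], code[v]) for v in P)

def H_I(P: dict) -> float:
    """Identification entropy H_I(P) = 2 (1 - sum_u P_u^2)."""
    return 2 * (1 - sum(p * p for p in P.values()))

# Hypothetical example: uniform source on four symbols, balanced code.
P = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
code = {"a": "00", "b": "01", "c": "10", "d": "11"}
L_PP = sum(P[u] * L(code, P, u) for u in P)
print(L_PP, H_I(P))  # here the bound L_C(P,P) >= H_I(P) holds with equality
```

For this uniform source both quantities equal $3/2$, so the lower bound $L_{\mathcal{C}}(P,P)\geq H_I(P)$ is tight here.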