Compression of graphical structures

  • Authors:
  • Yongwook Choi
  • Wojciech Szpankowski

  • Affiliations:
  • Department of Computer Science, Purdue University, W. Lafayette, IN (both authors)

  • Venue:
  • ISIT'09: Proceedings of the 2009 IEEE International Symposium on Information Theory - Volume 1
  • Year:
  • 2009

Abstract

F. Brooks argues in [3] that there is "no theory that gives us a metric for information embodied in structure." Shannon himself alluded to this fifty years earlier in his little-known 1953 paper [14]. Indeed, in the past information theory dealt mostly with "conventional data," be it textual, image, or video data. In recent years, however, databases of various sorts have come into existence for storing "unconventional data," including biological data, web data, topographical maps, and medical data. In compressing such data structures, one must consider two types of information: the information conveyed by the structure itself, and the information conveyed by the data labels embedded in the structure. In this paper, we address the former problem by studying the information content of graphical structures (i.e., unlabeled graphs). In particular, we consider the Erdős-Rényi graphs G(n, p) over n vertices, in which edges are added randomly with probability p. We prove that the structural entropy of G(n, p) is (n choose 2) h(p) - log n! + o(1) = (n choose 2) h(p) - n log n + O(n), where h(p) = -p log p - (1 - p) log(1 - p) is the entropy rate of a conventional memoryless binary source. We then design a two-stage encoding that optimally compresses unlabeled graphs up to the first two leading terms of the structural entropy.
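To make the quantities in the abstract concrete, the following sketch (not from the paper; function names are illustrative) evaluates the leading terms of the structural entropy formula numerically. The labeled graph G(n, p) has entropy (n choose 2) h(p), one Bernoulli(p) bit decision per potential edge; the structural (unlabeled) entropy subtracts log n!, the information carried by the vertex labeling. Logarithms are base 2 so the result is in bits, and the o(1) correction is ignored.

```python
import math

def h(p: float) -> float:
    """Binary entropy in bits: h(p) = -p log2 p - (1-p) log2 (1-p)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def labeled_entropy(n: int, p: float) -> float:
    """Entropy of the labeled Erdős-Rényi graph G(n, p):
    (n choose 2) * h(p) bits, one edge indicator per vertex pair."""
    return math.comb(n, 2) * h(p)

def structural_entropy_estimate(n: int, p: float) -> float:
    """First two leading terms of the structural entropy:
    (n choose 2) * h(p) - log2(n!), dropping the o(1) term.
    lgamma(n+1) = ln(n!), converted to base 2."""
    log2_factorial_n = math.lgamma(n + 1) / math.log(2)
    return labeled_entropy(n, p) - log2_factorial_n

n, p = 100, 0.5
print(f"labeled entropy of G({n}, {p}):      {labeled_entropy(n, p):.1f} bits")
print(f"structural entropy (leading terms): {structural_entropy_estimate(n, p):.1f} bits")
print(f"label information log2(n!):         "
      f"{labeled_entropy(n, p) - structural_entropy_estimate(n, p):.1f} bits")
```

For n = 100 and p = 1/2 the labeled entropy is exactly (100 choose 2) = 4950 bits, so an optimal structural compressor saves roughly log2(100!) ≈ 525 bits by discarding the vertex labels.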