Information-theoretic analysis of information hiding

  • Authors:
  • P. Moulin; J. A. O'Sullivan

  • Affiliations:
  • Beckman Inst., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA;-

  • Venue:
  • IEEE Transactions on Information Theory
  • Year:
  • 2006

Abstract

An information-theoretic analysis of information hiding is presented, forming the theoretical basis for design of information-hiding systems. Information hiding is an emerging research area which encompasses applications such as copyright protection for digital media, watermarking, fingerprinting, steganography, and data embedding. In these applications, information is hidden within a host data set and is to be reliably communicated to a receiver. The host data set is intentionally corrupted, but in a covert way, designed to be imperceptible to a casual analysis. Next, an attacker may seek to destroy this hidden information, and for this purpose, introduce additional distortion to the data set. Side information (in the form of cryptographic keys and/or information about the host signal) may be available to the information hider and to the decoder. We formalize these notions and evaluate the hiding capacity, which upper-bounds the rates of reliable transmission and quantifies the fundamental tradeoff between three quantities: the achievable information-hiding rates and the allowed distortion levels for the information hider and the attacker. The hiding capacity is the value of a game between the information hider and the attacker. The optimal attack strategy is the solution of a particular rate-distortion problem, and the optimal hiding strategy is the solution to a channel-coding problem. The hiding capacity is derived by extending the Gel'fand-Pinsker (1980) theory of communication with side information at the encoder. The extensions include the presence of distortion constraints, side information at the decoder, and unknown communication channel. Explicit formulas for capacity are given in several cases, including Bernoulli and Gaussian problems, as well as the important special case of small distortions. In some cases, including the last two above, the hiding capacity is the same whether or not the decoder knows the host data set. It is shown that many existing information-hiding systems in the literature operate far below capacity.
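
Illustrative sketch (not part of the abstract): the abstract mentions an explicit capacity formula for the Gaussian problem without stating it. The Python function below evaluates the closed form as it is commonly cited for a Gaussian host with squared-error distortion, C = (1/2) log2(1 + D1 / (beta * D2)) with beta = (sigma^2 + D1) / (sigma^2 + D1 - D2), and C = 0 when D2 >= sigma^2 + D1. This expression, the function name, and the parameter names are assumptions made for illustration and should be checked against the paper itself before being relied on.

```python
import math


def gaussian_hiding_capacity(host_var: float, d1: float, d2: float) -> float:
    """Hiding capacity in bits per sample for a Gaussian host (assumed closed form).

    host_var : variance sigma^2 of the host signal (assumed zero-mean Gaussian)
    d1       : mean-squared distortion allowed to the information hider
    d2       : mean-squared distortion allowed to the attacker, measured
               relative to the marked signal (an assumption of this sketch)
    """
    if host_var <= 0 or d1 < 0 or d2 <= 0:
        raise ValueError("host_var and d2 must be positive, d1 non-negative")

    marked_var = host_var + d1  # variance of the marked signal

    # An attacker allowed this much distortion can erase the signal entirely.
    if d2 >= marked_var:
        return 0.0

    # Under a Gaussian test-channel attack, the effective noise variance
    # seen by the decoder is beta * d2, with beta >= 1.
    beta = marked_var / (marked_var - d2)
    return 0.5 * math.log2(1.0 + d1 / (beta * d2))


if __name__ == "__main__":
    # Hypothetical numbers chosen only to exercise the function.
    for sigma2 in (10.0, 100.0, 1e6):
        rate = gaussian_hiding_capacity(sigma2, d1=1.0, d2=1.0)
        print(f"sigma^2 = {sigma2:g}: {rate:.6f} bits/sample")
```

When both distortions are small relative to the host variance, beta approaches 1 and the expression reduces to (1/2) log2(1 + D1 / D2), which no longer depends on the host variance. This matches the small-distortion special case that the abstract singles out.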