High-rate random-like spherical fingerprinting codes with linear decoding complexity

  • Authors:
  • Jean-François Jourdas
  • Pierre Moulin

  • Affiliations:
  • Jean-François Jourdas: EDS, Paris, France, and the Beckman Institute and Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL
  • Pierre Moulin: Beckman Institute, Coordinated Science Laboratory, and Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL

  • Venue:
  • IEEE Transactions on Information Forensics and Security - Special issue on electronic voting
  • Year:
  • 2009

Abstract

The rate of a fingerprinting code is defined as R = (1/N) log₂ M, where N is the code length and M is the number of users. Capacity is the supremum of achievable rates for a given class of collusion attacks. Most fingerprinting codes in the current literature are algebraic constructions with high minimum distance. These codes have low rate (relative to capacity) and thus long fingerprints for a given number of users and colluders. Short fingerprints, however, are valuable in media fingerprinting because of the limited number of robust features available for embedding. This paper proposes a framework for building high-rate fingerprinting codes that operate near the fundamental capacity limit by concatenating short, random, statistically independent subcodes. A practical implementation based on a turbo-code construction is presented. Each subcode is decoded by a list Viterbi algorithm, which outputs a list of suspect users. These lists are then processed by a matched filter, which extracts the most suspect user and declares him or her guilty. We provide examples of codes that are short, accommodate millions of users, and withstand dozens of colluders (with an error probability on the order of 1%) under averaging or interleaving attacks followed by additive white Gaussian noise. Our fingerprinting codes operate reliably at rates between 30% and 50% of capacity, substantially higher than those of any other existing code. The decoding complexity is linear in N or, equivalently, in log M.
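As a rough illustration of the quantities in the abstract, the sketch below generates random spherical fingerprints, computes the rate R = (1/N) log₂ M, applies an averaging collusion attack followed by AWGN, and accuses the user whose fingerprint correlates most strongly with the forgery (the matched-filter step). All parameters (N, M, c, sigma) and the brute-force correlation search are illustrative assumptions, not the paper's construction; the paper instead concatenates short random subcodes decoded by a list Viterbi algorithm.

    # Minimal sketch, NOT the authors' implementation: random spherical
    # fingerprints, an averaging collusion attack with AWGN, and
    # matched-filter (correlation) detection of the most suspect user.
    # N, M, c, and sigma below are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)

    N = 512          # fingerprint (code) length
    M = 4096         # number of users
    c = 8            # number of colluders
    sigma = 0.3      # AWGN standard deviation

    # Rate as defined in the abstract: R = (1/N) log2 M.
    R = np.log2(M) / N
    print(f"rate R = {R:.4f} bits per sample")

    # Random spherical fingerprints: i.i.d. Gaussian rows normalized to
    # the sphere of radius sqrt(N) (unit average power per sample).
    X = rng.standard_normal((M, N))
    X *= np.sqrt(N) / np.linalg.norm(X, axis=1, keepdims=True)

    # Averaging attack: c colluders average their marked copies, then
    # additive white Gaussian noise is applied to the result.
    colluders = rng.choice(M, size=c, replace=False)
    forgery = X[colluders].mean(axis=0) + sigma * rng.standard_normal(N)

    # Matched filter: correlate the forgery against every fingerprint
    # and accuse the user with the highest score.
    scores = X @ forgery
    accused = int(np.argmax(scores))
    print("accused user", accused, "is a colluder:", accused in colluders)

Note that this toy detector correlates the forgery against all M fingerprints, an O(MN) search; the paper's concatenated construction avoids this by list Viterbi decoding of the short subcodes followed by matched filtering over the resulting suspect lists, which is how the decoding cost stays linear in N (equivalently, in log M).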