Detecting the Theft of Natural Language Text Using Birthmark

  • Authors: Jianlong Yang; Jianmin Wang; Deyi Li

  • Affiliations: Tsinghua University, China; Tsinghua University, China; Tsinghua University, China

  • Venue: IIH-MSP '06 Proceedings of the 2006 International Conference on Intelligent Information Hiding and Multimedia
  • Year: 2006

Abstract

To detect the theft of natural language text effectively, we present a novel scheme for deriving a birthmark from the text. Because a birthmark is a unique, native characteristic of every text, a text that has the same birthmark as another can readily be suspected of being a copy. Ideally, a birthmark should satisfy two properties: (a) credibility - independent texts must be distinguished by completely different birthmarks, and (b) resilience - the birthmark should tolerate meaning-preserving attacks. To evaluate the effectiveness of the proposed birthmark, we conduct two experiments. The first shows that the birthmark successfully distinguishes non-copied files. The second shows that the birthmark tolerates meaning-preserving attacks well.
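
The abstract does not say how the birthmark is derived, so the following is only an illustrative sketch of how the two properties could be checked, not the authors' scheme. It assumes a hypothetical stand-in birthmark (the set of lowercase words longer than three letters), Jaccard similarity as the comparison, and sentence reordering as one example of a meaning-preserving edit; the function names and sample texts are invented for illustration.

```python
import re


def birthmark(text: str) -> frozenset:
    """Hypothetical stand-in birthmark: lowercase words longer than three letters."""
    return frozenset(w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3)


def similarity(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity of two birthmarks, in the range [0, 1]."""
    if not a and not b:
        return 1.0  # two empty birthmarks are treated as identical
    return len(a & b) / len(a | b)


# Invented sample texts (assumptions, not data from the paper).
original = "The committee approved the budget. Engineers will begin construction in spring."
reordered = "Engineers will begin construction in spring. The committee approved the budget."
independent = "Heavy rain flooded several villages along the river yesterday."

# Credibility: independent texts should score low.
credibility_score = similarity(birthmark(original), birthmark(independent))

# Resilience: a meaning-preserving edit (here, sentence reordering) should score high.
resilience_score = similarity(birthmark(original), birthmark(reordered))

print(credibility_score, resilience_score)
```

A word-set birthmark like this one would not survive synonym substitution, which is why it is offered only as a way to make the two properties concrete.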