Practical linguistic steganography using contextual synonym substitution and vertex colour coding

  • Authors:
  • Ching-Yun Chang;Stephen Clark

  • Affiliations:
  • University of Cambridge;University of Cambridge

  • Venue:
  • EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Linguistic Steganography is concerned with hiding information in natural language text. One of the major transformations used in Linguistic Steganography is synonym substitution. However, few existing studies have studied the practical application of this approach. In this paper we propose two improvements to the use of synonym substitution for encoding hidden bits of information. First, we use the Web 1T Google n-gram corpus for checking the applicability of a synonym in context, and we evaluate this method using data from the SemEval lexical substitution task. Second, we address the problem that arises from words with more than one sense, which creates a potential ambiguity in terms of which bits are encoded by a particular word. We develop a novel method in which words are the vertices in a graph, synonyms are linked by edges, and the bits assigned to a word are determined by a vertex colouring algorithm. This method ensures that each word encodes a unique sequence of bits, without cutting out large number of synonyms, and thus maintaining a reasonable embedding capacity.