Practical linguistic steganography using contextual synonym substitution and vertex colour coding

Authors:
Ching-Yun Chang;Stephen Clark
Affiliations:
University of Cambridge;University of Cambridge
Venue:
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Year:
2010

Abstract

Linguistic Steganography is concerned with hiding information in natural language text. One of the major transformations used in Linguistic Steganography is synonym substitution. However, few studies have examined the practical application of this approach. In this paper we propose two improvements to the use of synonym substitution for encoding hidden bits of information. First, we use the Google Web 1T n-gram corpus to check the applicability of a synonym in context, and we evaluate this method using data from the SemEval lexical substitution task. Second, we address the problem that arises from words with more than one sense, which creates a potential ambiguity as to which bits are encoded by a particular word. We develop a novel method in which words are the vertices in a graph, synonyms are linked by edges, and the bits assigned to a word are determined by a vertex colouring algorithm. This method ensures that each word encodes a unique sequence of bits without discarding a large number of synonyms, thus maintaining a reasonable embedding capacity.
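
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy bigram counts stand in for the Web 1T corpus, and the scoring rule, the `threshold` parameter, the greedy colouring order, and the example synsets are all assumptions made for illustration. The sketch shows (1) accepting a synonym only if its n-gram counts in context are close enough to those of the original word, and (2) colouring a synonym graph so that linked words never share a bit string, with each word receiving a single code regardless of how many senses it has.

```python
def context_score(left, word, right, counts):
    """Sum the counts of the two bigrams that contain `word` in this context."""
    return counts.get((left, word), 0) + counts.get((word, right), 0)


def acceptable(left, original, candidate, right, counts, threshold=0.5):
    """Accept `candidate` if its context score reaches `threshold` times the
    original word's score. The ratio rule is an assumed simplification."""
    orig_score = context_score(left, original, right, counts)
    if orig_score == 0:
        return False
    cand_score = context_score(left, candidate, right, counts)
    return cand_score / orig_score >= threshold


def build_graph(synsets):
    """Words are vertices; an edge links every pair of words sharing a synset."""
    graph = {}
    for synset in synsets:
        for word in synset:
            graph.setdefault(word, set()).update(w for w in synset if w != word)
    return graph


def colour_codes(graph, n_bits=2):
    """Greedy vertex colouring with at most 2**n_bits colours.

    Returns a dict mapping each coloured word to a bit string. A word for
    which no colour is free is left out, so it would not carry hidden bits.
    """
    n_colours = 2 ** n_bits
    colours = {}
    # Colour highly connected words first; break ties alphabetically.
    for word in sorted(graph, key=lambda w: (-len(graph[w]), w)):
        used = {colours[n] for n in graph[word] if n in colours}
        free = [c for c in range(n_colours) if c not in used]
        if free:
            colours[word] = free[0]
    return {w: format(c, f"0{n_bits}b") for w, c in colours.items()}


if __name__ == "__main__":
    # Hypothetical counts, not taken from the Web 1T corpus.
    toy_counts = {
        ("drink", "strong"): 50, ("strong", "tea"): 100,
        ("drink", "powerful"): 2, ("powerful", "tea"): 3,
        ("drink", "robust"): 40, ("robust", "tea"): 60,
    }
    for candidate in ("powerful", "robust"):
        ok = acceptable("drink", "strong", candidate, "tea", toy_counts)
        print(candidate, "accepted" if ok else "rejected")

    # Hypothetical synsets; "splice" belongs to two of them (two senses).
    toy_synsets = [["marry", "wed", "splice"], ["splice", "join"]]
    print(colour_codes(build_graph(toy_synsets)))
```

Because every pair of words in a synset is linked by an edge, a proper colouring gives the members of each synset distinct codes, which is what lets a receiver recover the bits without knowing which sense was intended. Capping the number of colours at `2**n_bits` is one possible design choice; it trades a fixed code length against the chance that some words cannot be coloured.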