Natural language watermarking for German texts

Authors:
Oren Halvani;Martin Steinebach;Patrick Wolf;Ralf Zimmermann
Affiliations:
Fraunhofer Institute for Secure Information Technology SIT, Darmstadt, Germany;Fraunhofer Institute for Secure Information Technology SIT, Darmstadt, Germany;Fraunhofer Institute for Secure Information Technology SIT, Darmstadt, Germany;Fraunhofer Institute for Secure Information Technology SIT, Darmstadt, Germany
Venue:
Proceedings of the first ACM workshop on Information hiding and multimedia security
Year:
2013

Abstract

In this paper we present four informed natural language watermark embedding methods that operate on the lexical and syntactic layers of German texts. Our scheme offers several benefits over state-of-the-art approaches; for instance, it does not rely on complex NLP operations such as full sentence parsing, word sense disambiguation, named entity recognition or semantic role parsing. Even rich lexical resources (e.g. WordNet or the Collins thesaurus), which play an essential role in many previous approaches, are unnecessary for our system. Instead, our methods require only a part-of-speech tagger, simple word lists that act as blacklists and whitelists, and a trained classifier that automatically predicts whether a candidate lexical or syntactic pattern can carry part of the watermark message. In addition, some of the proposed methods can easily be adapted to other Indo-European languages, since the grammar rules they rely on are not restricted to German. Because the methods perform only lexical and minor syntactic transformations, the watermarked text is not affected by grammatical distortion, and the meaning of the text is preserved in 82.14% of the cases.
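
The general idea of word-list-gated lexical embedding can be illustrated with a toy sketch. It is not the method of the paper: the word pairs, the blocked contexts and all function names below are hypothetical, and the part-of-speech tagger and trained classifier that the abstract mentions are left out entirely. Each whitelisted pair of interchangeable words carries one bit (first variant = 0, second variant = 1), and a small blacklist of contexts keeps fixed expressions untouched.

```python
# Toy sketch only. PAIRS and BLOCKED_BIGRAMS are hypothetical examples,
# not the word lists used by the authors.
PAIRS = [("bereits", "schon"), ("jedoch", "allerdings")]  # whitelist
BLOCKED_BIGRAMS = {("wenn", "schon")}                     # blacklist

# Map every variant to its pair: index 0 encodes bit 0, index 1 encodes bit 1.
VARIANT = {word: pair for pair in PAIRS for word in pair}


def carriers(tokens):
    """Return the indices of tokens that may carry one bit.

    A token is skipped if ANY variant of its pair would form a blocked
    bigram with the preceding token, so the set of carriers is the same
    before and after embedding (assuming blacklist context words are not
    carriers themselves).
    """
    found = []
    for i, tok in enumerate(tokens):
        if tok not in VARIANT:
            continue
        prev = tokens[i - 1] if i > 0 else ""
        if any((prev, v) in BLOCKED_BIGRAMS for v in VARIANT[tok]):
            continue
        found.append(i)
    return found


def embed(tokens, bits):
    """Write bits into the carrier positions; surplus carriers stay as-is."""
    out = list(tokens)
    for i, bit in zip(carriers(tokens), bits):
        out[i] = VARIANT[out[i]][bit]
    return out


def extract(tokens):
    """Read one bit from every carrier position."""
    return [VARIANT[tokens[i]].index(tokens[i]) for i in carriers(tokens)]


if __name__ == "__main__":
    text = "er hat es bereits gesagt jedoch wenn schon dann richtig".split()
    marked = embed(text, [1, 0])
    print(" ".join(marked))
    print(extract(marked))
```

In this sketch the blacklist check is what keeps the "schon" in "wenn schon" out of the carrier set, which is the role the abstract assigns to simple word lists; a real system would additionally need tagging and a classifier to decide which candidates are safe to change.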