Linguistic Steganography Detection Using Statistical Characteristics of Correlations between Words

Authors:
Zhili Chen; Liusheng Huang; Zhenshan Yu; Wei Yang; Lingjun Li; Xueling Zheng; Xinxin Zhao
Affiliations:
National High Performance Computing Center at Hefei, Department of Computer Science and Technology, University of Science and Technology of China, Hefei, China 230027 (all authors)
Venue:
Information Hiding
Year:
2008

Abstract

Linguistic steganography is a branch of Information Hiding (IH) that uses written natural language to conceal secret messages, and it plays an important role in the area of Information Security (IS). Previous work on linguistic steganography has focused mainly on embedding techniques, with little research on attacks against them. In this paper, a novel statistical algorithm for linguistic steganography detection is presented. We use the statistical characteristics of correlations between general service words gathered in a dictionary to classify given text segments as either stego-text segments or normal text segments. In an experiment on blind detection of three linguistic steganography approaches (Markov-Chain-Based, NICETEXT, and TEXTO), the total accuracy of distinguishing stego-text segments from normal text segments is 97.19%. Our results show that linguistic steganalysis based on correlations between words is promising.
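
The abstract does not say which correlation measure, which statistics, or which decision rule the algorithm uses, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes pointwise mutual information (PMI) between dictionary words that co-occur within a small window as the correlation measure, the mean and variance of those PMI values as the statistical characteristics, and a nearest-centroid rule as the classifier. The function names, the `window` parameter, and the centroid values are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from statistics import mean, pvariance

PUNCT = ".,;:!?\"'()"

def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    stripped = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in stripped if w]

def pmi_values(tokens, dictionary, window=4):
    """PMI of each unordered pair of distinct dictionary words that
    co-occur within `window` following tokens (assumed measure)."""
    word_counts = Counter(t for t in tokens if t in dictionary)
    pair_counts = Counter()
    for i, left in enumerate(tokens):
        if left not in dictionary:
            continue
        for right in tokens[i + 1 : i + 1 + window]:
            if right in dictionary and right != left:
                pair_counts[tuple(sorted((left, right)))] += 1
    n_words = sum(word_counts.values())
    n_pairs = sum(pair_counts.values())
    if n_pairs == 0:
        return []
    values = []
    for (a, b), count in pair_counts.items():
        p_ab = count / n_pairs
        p_a = word_counts[a] / n_words
        p_b = word_counts[b] / n_words
        values.append(math.log2(p_ab / (p_a * p_b)))
    return values

def correlation_features(text, dictionary, window=4):
    """Summarise a text segment as (mean PMI, variance of PMI)."""
    values = pmi_values(tokenize(text), dictionary, window)
    if not values:
        return (0.0, 0.0)
    return (mean(values), pvariance(values))

def nearest_label(features, centroids):
    """Assign the label whose centroid is closest (hypothetical rule).
    `centroids` maps a label to a (mean, variance) pair that would be
    estimated from labelled training segments."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))
```

In use, one would compute `correlation_features` for labelled normal and stego segments, average them per class to obtain the centroids, and then call `nearest_label` on the features of an unseen segment. Any trained classifier over the same features could replace the nearest-centroid rule.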