Steganalysis against substitution-based linguistic steganography based on context clusters

  • Authors:
  • Zhili Chen; Liusheng Huang; Haibo Miao; Wei Yang; Peng Meng

  • Affiliations:
  • NHPCC, Depart. of CS. & Tech., University of Science and Technology of China, Hefei 230027, China and Suzhou Institute for Advanced Study, USTC, Suzhou 215123, China (Zhili Chen, Liusheng Huang, Haibo Miao, Wei Yang); NHPCC, Depart. of CS. & Tech., University of Science and Technology of China, Hefei 230027, China (Peng Meng)

  • Venue:
  • Computers and Electrical Engineering
  • Year:
  • 2011

Abstract

Interest in linguistic steganalysis has grown over the past few years, stimulated by the emerging research area of linguistic steganography. However, because the natural language processing capability of computers is limited, linguistic steganalysis remains a challenging task. Existing steganalysis methods are ineffective against most substitution-based linguistic steganography methods, which preserve the syntactic and semantic correctness of cover texts. This paper proposes a new steganalysis scheme, based on context clusters, against substitution-based linguistic steganography. In this scheme, we introduce context clusters to estimate context fitness and show how to use the statistics of context fitness values to distinguish normal texts from stego texts. Finally, under this scheme, we present a steganalysis method for synonym substitution-based linguistic steganography. Our experimental results show that the proposed method detects synonym substitution-based linguistic steganography effectively, with steganalysis accuracy reaching as high as 98.86%.
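
The abstract does not define context clusters or context fitness, so the following Python sketch is only an illustration of the general idea: score how well each substitutable word fits its surrounding words, then use a statistic of those scores to separate normal texts from stego texts. Everything in it is an assumption made for illustration, not the authors' method: the toy co-occurrence table `COOCCURRENCE`, the synonym set in `SYNSETS`, the fitness formula (a word's share of co-occurrence counts among its synonyms), the window size, and the threshold are all hypothetical.

```python
from statistics import mean, pvariance

# Hypothetical toy counts: how often a word was observed near a context word.
COOCCURRENCE = {
    ("big", "house"): 40, ("large", "house"): 35, ("great", "house"): 5,
    ("big", "deal"): 50, ("large", "deal"): 2, ("great", "deal"): 45,
}
# Hypothetical synonym sets whose members a stego encoder could swap.
SYNSETS = [{"big", "large", "great"}]


def synonyms_of(word):
    """Return the synonym set containing `word`, or None if it is not substitutable."""
    for synset in SYNSETS:
        if word in synset:
            return synset
    return None


def context_fitness(word, context):
    """Assumed fitness: the word's share of co-occurrence counts among its synonyms."""
    synset = synonyms_of(word)
    own = sum(COOCCURRENCE.get((word, c), 0) for c in context)
    total = sum(COOCCURRENCE.get((s, c), 0) for s in synset for c in context)
    return own / total if total else None  # None: no evidence for this context


def fitness_statistics(tokens, window=2):
    """Mean and population variance of fitness over all substitutable words."""
    scores = []
    for i, word in enumerate(tokens):
        if synonyms_of(word) is None:
            continue
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        score = context_fitness(word, context)
        if score is not None:
            scores.append(score)
    if not scores:
        return None
    return mean(scores), pvariance(scores)


def looks_like_stego(tokens, mean_threshold=0.3):
    """Flag a text whose average fitness falls below an assumed threshold."""
    stats = fitness_statistics(tokens)
    return stats is not None and stats[0] < mean_threshold


if __name__ == "__main__":
    print(looks_like_stego("a big deal".split()))
    print(looks_like_stego("a large deal".split()))
```

In this toy setup, a text that uses the synonym commonly seen in its context scores high, while a text in which a rarer synonym has been substituted scores low. A realistic detector would learn the counts from a large corpus and would set the decision rule from labeled normal and stego texts rather than from a fixed threshold.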