Detecting word substitutions: PMI vs. HMM

  • Authors:
  • Dmitri Roussinov;SzeWang Fong;David Skillicorn

  • Affiliations:
  • Arizona State University, Tempe, AZ;Queen's University, Kingston, ON, Canada;Queen's University, Kingston, ON, Canada

  • Venue:
  • SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Those who want to conceal the content of their communications can do so by replacing words that might trigger attention. For example, instead of writing "The bomb is in position", a terrorist may chose to write "The flower is in position." The substituted sentence would sound a bit "odd" for a human reader and it has been shown in prior research that such oddity is detectable by text mining approaches. However, the importance of each component in the suggested oddity detection approach has not been thoroughly investigated. Also, the approach has not been compared with such an obvious candidate for the task as Hidden Markov Models (HMM). In this work, we explore further oddity detection algorithms reported earlier, specifically those based on pointwise mutual information (PMI) and Hidden Markov Models (HMM).