Applying interestingness measures to Ansar forum texts

  • Authors: D. B. Skillicorn
  • Affiliations: Queen's University, Canada
  • Venue: ACM SIGKDD Workshop on Intelligence and Security Informatics
  • Year: 2010

Abstract

Documents from the Ansar aljihad forum are ranked using a number of word-usage models. Analysis of overall content shows that postings fall strongly into two categories. A model describing Salafist-jihadi content generates a very clear single-factor ranking of postings. This ranking could be interpreted as selecting the most radical postings, and so could direct analyst attention to the most significant documents. A model for deception creates a multifactor ranking that produces a similar ordering, with low-deception postings identified with highly Salafist-jihadi ones. This suggests either that such postings are extremely sincere, or that personal pronoun use and intricate structuring are also markers of Salafist-jihadi language. Although the overall approach is relatively straightforward, the choice of parameters to maximize the usefulness of the results is intricate.
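
As a rough illustration of how a word-usage model can yield a single-factor ranking, the sketch below builds a document-by-word frequency matrix restricted to a model vocabulary and scores each document by its projection onto the leading factor of that matrix. This is only a minimal sketch under stated assumptions: the abstract does not specify the factorization, the word lists, or the parameters, so the power-iteration step, the placeholder vocabulary MODEL_WORDS, and all function names here are hypothetical and not taken from the paper.

```python
from collections import Counter
import math

# Hypothetical placeholder vocabulary; the paper's actual word lists are not given here.
MODEL_WORDS = ["alpha", "beta", "gamma"]


def doc_word_matrix(docs, model_words):
    """Rows are documents, columns are model words; each entry is the
    word's count divided by the document length (in whitespace tokens)."""
    rows = []
    for text in docs:
        tokens = text.lower().split()
        counts = Counter(tokens)
        n = max(len(tokens), 1)  # avoid division by zero for empty documents
        rows.append([counts[w] / n for w in model_words])
    return rows


def leading_factor_scores(matrix, iterations=100):
    """Score each row by its projection onto the leading right singular
    vector of the matrix, estimated by power iteration on A^T A.
    (Assumed technique for illustration, not the paper's stated method.)"""
    n_cols = len(matrix[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols  # positive start keeps v non-negative
    for _ in range(iterations):
        av = [sum(a * b for a, b in zip(row, v)) for row in matrix]  # A v
        atav = [
            sum(matrix[i][j] * av[i] for i in range(len(matrix)))
            for j in range(n_cols)
        ]  # A^T (A v)
        norm = math.sqrt(sum(x * x for x in atav))
        if norm == 0.0:  # no model words occur in any document
            break
        v = [x / norm for x in atav]
    return [sum(a * b for a, b in zip(row, v)) for row in matrix]


def rank_documents(docs, model_words=MODEL_WORDS):
    """Return document indices ordered from highest to lowest factor score."""
    scores = leading_factor_scores(doc_word_matrix(docs, model_words))
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Calling rank_documents on a list of posting texts returns their indices with the postings that use the model vocabulary most heavily first, which is the kind of ordering an analyst could use to prioritize reading. Choices such as length normalization and vocabulary size stand in for the parameter choices the abstract describes as intricate.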