Multi-lingual detection of terrorist content on the web

  • Authors:
  • Mark Last; Alex Markov; Abraham Kandel

  • Affiliations:
  • Department of Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel; Department of Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel; Department of Computer Science and Engineering, University of South Florida, Tampa, FL

  • Venue:
  • WISI'06 Proceedings of the 2006 international conference on Intelligence and Security Informatics
  • Year:
  • 2006

Abstract

Since the web is increasingly used by terrorist organizations for propaganda, disinformation, and other purposes, the ability to automatically detect terrorist-related content in multiple languages can be extremely useful. In this paper we describe a new, classification-based approach to multi-lingual detection of terrorist documents. The proposed approach builds upon a recently developed graph-based web document representation model, combined with the popular C4.5 decision-tree classification algorithm. The approach is evaluated on a collection of 648 Arabic-language web documents. The results demonstrate that documents downloaded from several known terrorist sites can be reliably discriminated from Arabic news reports using a simple decision tree.
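
The abstract does not spell out the representation or the tree induction, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes one simple variant of a graph-based document model (nodes are terms, directed edges link adjacent terms), treats the presence of an edge as a boolean feature, and picks the root split by plain information gain. C4.5 itself uses the gain ratio and builds a full tree with pruning; that is omitted here. The function names, the toy documents, and the labels "A" and "B" are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


def build_graph(text):
    """Represent a document as a set of directed edges (w1, w2),
    one for each pair of adjacent words (an assumed, simplified model)."""
    words = text.lower().split()
    return set(zip(words, words[1:]))


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(graphs, labels, edge):
    """Entropy reduction from splitting documents on presence of `edge`."""
    with_edge = [y for g, y in zip(graphs, labels) if edge in g]
    without_edge = [y for g, y in zip(graphs, labels) if edge not in g]
    remainder = 0.0
    for part in (with_edge, without_edge):
        if part:
            remainder += len(part) / len(labels) * entropy(part)
    return entropy(labels) - remainder


def best_edge(graphs, labels):
    """Return the edge whose presence test gives the highest gain."""
    candidates = sorted(set().union(*graphs))
    return max(candidates, key=lambda e: information_gain(graphs, labels, e))


# Hypothetical toy corpus: four short "documents" in two classes.
docs = [
    "alpha beta gamma",
    "alpha beta delta",
    "gamma delta alpha",
    "delta gamma beta",
]
labels = ["A", "A", "B", "B"]

graphs = [build_graph(d) for d in docs]
root = best_edge(graphs, labels)
print("root split on edge:", root)
```

In this toy corpus the edge from "alpha" to "beta" occurs in both class-A documents and in neither class-B document, so a single presence test on that edge separates the classes. This shows how a very small tree over graph-derived features can be sufficient when the classes use distinctive term sequences.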