Mining e-mail content for author identification forensics

  • Authors:
  • O. de Vel; A. Anderson; M. Corney; G. Mohay

  • Affiliations:
  • Defence Science and Technology Organisation, Salisbury, Australia; Queensland University of Technology, Brisbane, Australia; Queensland University of Technology, Brisbane, Australia; Queensland University of Technology, Brisbane, Australia

  • Venue:
  • ACM SIGMOD Record
  • Year:
  • 2001

Abstract

We describe an investigation into e-mail content mining for author identification, or authorship attribution, for the purpose of forensic investigation. We focus on the ability to discriminate between authors both when e-mail topics are aggregated and across different e-mail topics. An extended set of e-mail document features, including structural characteristics and linguistic patterns, was derived and used, together with a Support Vector Machine learning algorithm, to mine the e-mail content. Experiments using a number of e-mail documents generated by different authors on a set of topics gave promising results for both aggregated and multi-topic author categorisation.
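
As an illustration of what "structural characteristics and linguistic patterns" might look like in practice, the following minimal Python sketch turns an e-mail body into a small numeric feature vector. It is not the authors' feature set: the function name `extract_features`, the function-word list, and the specific features chosen are all assumptions made for this example. Vectors of this kind could then be passed to any Support Vector Machine implementation for author classification.

```python
import re
from collections import Counter

# Hypothetical function-word list; the feature set used in the paper
# is not specified in the abstract, so this is an assumption.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "is", "it", "for")


def extract_features(body: str) -> dict:
    """Map an e-mail body to illustrative structural and linguistic features."""
    lines = body.splitlines()
    words = re.findall(r"[A-Za-z']+", body.lower())
    n_words = max(len(words), 1)   # avoid division by zero on empty input
    n_chars = max(len(body), 1)
    counts = Counter(words)

    features = {
        # Structural characteristics (layout of the message)
        "blank_line_ratio": sum(1 for ln in lines if not ln.strip()) / max(len(lines), 1),
        "has_greeting": float(bool(re.match(r"\s*(hi|hello|dear)\b", body, re.IGNORECASE))),
        # Linguistic patterns (style of the writing)
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "vocabulary_richness": len(counts) / n_words,
        "uppercase_ratio": sum(c.isupper() for c in body) / n_chars,
        "punctuation_ratio": sum(c in ".,;:!?" for c in body) / n_chars,
    }
    # Relative frequency of each function word
    for fw in FUNCTION_WORDS:
        features[f"fw_{fw}"] = counts[fw] / n_words
    return features
```

Calling `extract_features("Hi Bob,\n\nThe report is in the folder.\n")` yields a dictionary in which, for instance, `has_greeting` is 1.0 and `fw_the` is 0.25 (two occurrences of "the" among eight words). Features like these are largely independent of topic, which is why style-based features are a natural fit for discriminating authors across different e-mail topics.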