Authorship similarity detection from email messages

  • Authors:
  • Xiaoling Chen; Peng Hao; R. Chandramouli; K. P. Subbalakshmi

  • Affiliations:
  • Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ (all four authors)

  • Venue:
  • MLDM'11: Proceedings of the 7th International Conference on Machine Learning and Data Mining in Pattern Recognition
  • Year:
  • 2011

Abstract

It is easy to hide the true identity of an email's author. The author's actual name, email address, and similar identifying details can be changed arbitrarily to deceive the recipient. For example, a sender can change the identity shown in the email header when sending different emails to various recipients. In this paper, we therefore investigate techniques for detecting authorship similarity from the text content of short, topic-free emails. We identify 150 stylistic cues for this problem and propose a method based on frequent-pattern mining and machine learning. Extensive experimental results on the Enron email data set are also presented.
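
To make the idea of stylistic cues concrete, the following is a minimal Python sketch of how two email bodies might be compared by writing style rather than by topic. It is an illustration under stated assumptions, not the method proposed in the paper: the 16 features, the `FUNCTION_WORDS` list, and the helper names `stylistic_features` and `cosine_similarity` are all hypothetical, and plain cosine similarity stands in for the frequent-pattern and machine learning steps, which the abstract does not specify.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny set of function words; the paper's 150 cues are not listed
# in the abstract, so this list is an assumption made for illustration only.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "is", "i", "it")


def stylistic_features(text: str) -> list[float]:
    """Map an email body to a small vector of topic-independent style measures."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)   # guard against division by zero
    n_chars = max(len(text), 1)
    counts = Counter(words)

    features = [
        sum(len(w) for w in words) / n_words,       # mean word length
        len(words) / max(len(sentences), 1),        # mean sentence length (words)
        len(counts) / n_words,                      # vocabulary richness
        sum(c.isupper() for c in text) / n_chars,   # uppercase character ratio
        sum(c.isdigit() for c in text) / n_chars,   # digit character ratio
        sum(c in ",;:!?" for c in text) / n_chars,  # punctuation ratio
    ]
    # Relative frequency of each function word.
    features.extend(counts[w] / n_words for w in FUNCTION_WORDS)
    return features


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    email_a = "Hi Tom, I think the report is fine. Can you send it to me today?"
    email_b = "Tom, the numbers in the report look fine to me. Send it over!"
    score = cosine_similarity(stylistic_features(email_a), stylistic_features(email_b))
    print(f"style similarity: {score:.3f}")
```

The features here are left unscaled, so the word-length and sentence-length terms dominate the score; a practical system would likely normalize each feature and learn a decision threshold from labeled data.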