Auto-learning of SMTP TCP transport-layer features for spam and abusive message detection

  • Authors:
  • Georgios Kakavelakis;Robert Beverly;Joel Young

  • Affiliations:
  • Naval Postgraduate School;Naval Postgraduate School;Naval Postgraduate School

  • Venue:
  • LISA'11 Proceedings of the 25th international conference on Large Installation System Administration
  • Year:
  • 2011

Abstract

Botnets are a significant source of abusive messaging (spam, phishing, etc.) and other types of malicious traffic. A promising approach to mitigating botnet-generated traffic is signal analysis of transport-layer (i.e., TCP/IP) characteristics, e.g., timing, packet reordering, congestion, and flow control. Prior work [4] shows that machine learning analysis of such traffic features on an SMTP MTA can accurately differentiate between botnet and legitimate sources. We make two contributions toward the real-world deployment of such techniques: i) an architecture for real-time, on-line operation; and ii) auto-learning of the unsupervised model across different environments without human labeling (i.e., training). We present a "SpamFlow" SpamAssassin plugin and the requisite auxiliary daemons to integrate transport-layer signal analysis with a popular open-source spam filter. Using our system, we detail results from a production deployment in which our auto-learning technique achieves better than 95 percent accuracy, precision, and recall after receiving ≈ 1,000 emails.
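
As a reading aid, the three figures of merit quoted in the abstract (accuracy, precision, and recall) can be computed from a labeled sample roughly as follows. This is a minimal Python sketch using the standard definitions. The function name, the "spam"/"ham" label strings, and the toy data are illustrative assumptions and are not taken from the SpamFlow plugin or the paper's evaluation.

```python
def classification_metrics(predicted, actual, positive="spam"):
    """Return (accuracy, precision, recall) for a binary classifier.

    `predicted` and `actual` are equal-length sequences of labels.
    The label strings are hypothetical; any hashable labels work.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one sample is required")

    tp = fp = tn = fn = 0
    for p, a in zip(predicted, actual):
        if p == positive and a == positive:
            tp += 1          # spam correctly flagged
        elif p == positive:
            fp += 1          # legitimate mail flagged as spam
        elif a == positive:
            fn += 1          # spam that slipped through
        else:
            tn += 1          # legitimate mail correctly passed

    accuracy = (tp + tn) / len(predicted)
    # Guard against division by zero when a class never occurs.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    predicted = ["spam", "spam", "ham", "ham", "spam"]
    actual = ["spam", "ham", "ham", "spam", "spam"]
    acc, prec, rec = classification_metrics(predicted, actual)
    print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
```

Reporting all three together matters for spam filtering: accuracy alone can look high on a mail stream dominated by one class, while precision tracks how often legitimate mail is wrongly flagged and recall tracks how much spam is caught.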