On the effectiveness of IP reputation for spam filtering

  • Authors:
  • Holly Esquivel;Aditya Akella;Tatsuya Mori

  • Affiliations:
  • University of Wisconsin-Madison;University of Wisconsin-Madison;NTT Service Integration Laboratories

  • Venue:
  • COMSNETS'10 Proceedings of the 2nd international conference on COMmunication systems and NETworks
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Modern SMTP servers apply a variety of mechanisms to stem the volume of spam delivered to users. These techniques can be broadly classified into two categories: preacceptance approaches, which apply prior to a message being accepted (e.g. IP reputation), and post-acceptance techniques which apply after a message has been accepted (e.g. content based signatures). We argue that the effectiveness of these measures varies based on the SMTP sender type. This paper focuses on the most light-weight pre-acceptance filtering mechanism - IP reputation. We first classify SMTP senders into three main categories: legitimate servers, end-hosts, and spam gangs, and empirically study the limits of effectiveness regarding IP reputation filtering for each category. Next, we develop new techniques that build custom IP reputation lists, which significantly improve the performance of existing IP reputation lists. In compiling these lists, we leverage a somewhat surprising fact that both legitimate domains and spam domains often use the DNS Sender Policy Framework (SPF) in an attempt to pass simple authentication checks. That is, good/bad IP addresses can be systematically compiled by collecting good/bad domains and looking up their SPF resource records. We also evaluate the effectiveness of these lists over time. Finally, we aim to understand the characteristics of the three categories of email senders in depth. Overall, we find that it is possible to construct IP reputation lists that can identify roughly 90% of all spam and legitimate mail, but some of the lists, i.e. the lists for spam gangs, must be updated on a constant basis to maintain this high level of accuracy.