IP reputation is a common technique for addressing the email spam problem. Commercial implementations are available, but the algorithms behind them are confidential. A few open-source implementations (gossip, RepuScore, IP-GroupREP, etc.) are available, but few studies compare them with their commercial counterparts. For this reason, we made an empirical comparison of six popular commercial IP reputation databases and three open-source IP reputation algorithms. We built our own IP reputation database from an email corpus of 931,576 messages collected from real-time email traffic at an academic ISP. After processing and classifying the corpus, we compared the results of the open-source algorithms with the commercial databases, using the Spearman rank correlation coefficient to identify the optimal parameters for the open-source algorithms. The results show that the correlation coefficients decrease as the number of emails from a single IP increases. The open-source algorithms performed adequately for IP addresses that sent more than five and fewer than 50 emails, while (surprisingly) the correlation dropped for IP addresses that sent more. For this reason, we believe the open-source algorithms need additional fine-tuning to become comparable to their commercial counterparts, whose IP reputation scores are built from many sensors around the world. We also compared the commercial IP reputation databases with one another and found mixed correlations between them, which raises many questions about the algorithms used to build their IP reputation scores. The research also identified the problem of finding a good methodology for comparing IP reputation databases.
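The comparison described above relies on the Spearman rank correlation coefficient, which is the Pearson correlation computed on the ranks of two score lists rather than on the raw scores. The following is a minimal sketch of that calculation, not the authors' implementation. The function names and the example scores are hypothetical and serve only as illustration; tied scores are given the mean of the rank positions they occupy, which is one common convention.

```python
import math
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values equal to values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical reputation scores for the same five IP addresses
# (made-up values, not data from the study).
open_source_scores = [0.10, 0.35, 0.50, 0.80, 0.95]
commercial_scores = [5, 20, 45, 70, 90]

rho = spearman(open_source_scores, commercial_scores)
print(f"Spearman rho = {rho:.3f}")
```

Because only ranks matter, the two databases may use entirely different score scales, as in the example, and still be compared directly. This sketch assumes both databases order scores in the same direction (higher meaning the same thing in each); if one database scores "badness" and the other "goodness", the sign of the coefficient flips.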