Taster's choice: a comparative analysis of spam feeds

Authors:
Andreas Pitsillidis (University of California, San Diego, San Diego, California, USA)
Chris Kanich (University of Illinois, Chicago, Chicago, Illinois, USA)
Geoffrey M. Voelker (University of California, San Diego, San Diego, California, USA)
Kirill Levchenko (University of California, San Diego, San Diego, California, USA)
Stefan Savage (University of California, San Diego, San Diego, California, USA)
Venue:
Proceedings of the 2012 ACM Internet Measurement Conference
Year:
2012

Abstract

E-mail spam has been the focus of a wide variety of measurement studies, at least in part due to the plethora of spam data sources available to the research community. However, there has been little attention paid to the suitability of such data sources for the kinds of analyses they are used for. In spite of the broad range of data available, most studies use a single "spam feed" and there has been little examination of how such feeds may differ in content. In this paper we provide this characterization by comparing the contents of ten distinct contemporaneous feeds of spam-advertised domain names. We document significant variations based on how such feeds are collected and show how these variations can produce differences in findings as a result.
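
To make the idea of comparing feed contents concrete, here is a minimal sketch of one way to measure how much several feeds of domain names overlap. It is an illustration under stated assumptions, not the authors' method: the feed names, the toy domains, and the helper functions `pairwise_overlap` and `exclusive_domains` are all hypothetical, and Jaccard similarity is simply one convenient overlap measure.

```python
from itertools import combinations


def pairwise_overlap(feeds):
    """Return {(a, b): (shared_count, jaccard)} for every pair of feeds.

    `feeds` maps a feed name to the set of domains observed in it.
    """
    result = {}
    for a, b in combinations(sorted(feeds), 2):
        shared = feeds[a] & feeds[b]
        union = feeds[a] | feeds[b]
        jaccard = len(shared) / len(union) if union else 0.0
        result[(a, b)] = (len(shared), jaccard)
    return result


def exclusive_domains(feeds):
    """Return {name: domains that appear in that feed and in no other}."""
    exclusive = {}
    for name, domains in feeds.items():
        others = set().union(*(d for n, d in feeds.items() if n != name))
        exclusive[name] = domains - others
    return exclusive


# Hypothetical toy feeds; real feeds would hold many more domains.
feeds = {
    "honeypot": {"pills.example", "watches.example", "loans.example"},
    "blacklist": {"pills.example", "casino.example"},
    "user_reports": {"pills.example", "watches.example"},
}

for pair, (shared, jaccard) in pairwise_overlap(feeds).items():
    print(pair, shared, round(jaccard, 2))
print(exclusive_domains(feeds))
```

A low pairwise overlap, or a large set of exclusive domains, would suggest that an analysis run on one feed alone might not generalize to the others.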