Towards the effective temporal association mining of spam blacklists

Authors:
Andrew G. West; Insup Lee
Affiliations:
University of Pennsylvania, Philadelphia, PA; University of Pennsylvania, Philadelphia, PA
Venue:
Proceedings of the 8th Annual Collaboration, Electronic messaging, Anti-Abuse and Spam Conference
Year:
2011

Abstract

IP blacklists are a well-regarded anti-spam mechanism that captures global spamming patterns. These properties make such lists a practical ground truth by which to study email spam behaviors. Observing one blacklist for nearly a year and a half, we collected data on roughly half a billion listing events. In this paper, that data serves two purposes. First, we conduct a measurement study on the dynamics of blacklists and email spam at large. The magnitude and duration of the data enable scrutiny of long-term trends, at scale. Further, these statistics help parameterize our second task: the mining of blacklist history for temporal association rules. That is, we search for IP addresses with correlated histories. Strong correlations would suggest that group members are not independent entities and likely share botnet membership. Unfortunately, we find that statistically significant groupings are rare. This result is reinforced when rules are evaluated in terms of their ability to: (1) identify shared botnet members, using ground truth from botnet infiltrations and sinkholes, and (2) predict future blacklisting events. In both cases, performance improvements over a control classifier are nominal. This outcome forces us to re-examine the appropriateness of blacklist data for this task, and to suggest refinements to our mining model that may allow it to better capture the dynamics by which botnets operate.
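
To make the idea of "IP addresses with correlated histories" concrete, the following is a minimal sketch of pairwise co-listing rules computed from blacklist listing events. It is not the authors' mining model. The function name `co_listing_rules`, the fixed-width time bucketing, and the support and confidence definitions are all assumptions chosen for illustration, and the sample addresses and timestamps are invented.

```python
from collections import defaultdict
from itertools import combinations


def co_listing_rules(events, window, min_support=2):
    """Hypothetical helper: find IP pairs that are listed in the same time windows.

    events      -- iterable of (ip, timestamp) listing events (timestamps as integers)
    window      -- bucket width, in the same units as the timestamps (assumed fixed)
    min_support -- minimum number of shared windows for a pair to be reported

    Returns {(ip_a, ip_b): (support, conf_a_to_b, conf_b_to_a)}.
    """
    # Map each IP to the set of time buckets in which it was listed.
    windows_by_ip = defaultdict(set)
    for ip, ts in events:
        windows_by_ip[ip].add(ts // window)

    # Invert the mapping: which IPs were listed in each bucket?
    ips_by_window = defaultdict(set)
    for ip, buckets in windows_by_ip.items():
        for bucket in buckets:
            ips_by_window[bucket].add(ip)

    # Support of a pair = number of buckets in which both IPs were listed.
    pair_support = defaultdict(int)
    for ips in ips_by_window.values():
        for a, b in combinations(sorted(ips), 2):
            pair_support[(a, b)] += 1

    # Confidence of a -> b = shared buckets / buckets in which a was listed.
    rules = {}
    for (a, b), support in pair_support.items():
        if support >= min_support:
            rules[(a, b)] = (
                support,
                support / len(windows_by_ip[a]),
                support / len(windows_by_ip[b]),
            )
    return rules


# Invented sample data: three addresses, timestamps in seconds.
sample_events = [
    ("10.0.0.1", 0), ("10.0.0.2", 10),
    ("10.0.0.1", 100), ("10.0.0.2", 110), ("10.0.0.3", 105),
    ("10.0.0.2", 250),
]
sample_rules = co_listing_rules(sample_events, window=60)
```

Enumerating every pair within a bucket grows quadratically with the number of addresses listed in that bucket, so a sketch like this would need pruning or sampling before it could be applied to data on the scale the abstract describes.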