ECUE: A Spam Filter that Uses Machine Learning to Track Concept Drift

Authors:
Sarah Jane Delany;Pádraig Cunningham;Barry Smyth
Affiliations:
Dublin Institute of Technology, Kevin St., Dublin 8, Ireland, email: sarahjane.delany@comp.dit.ie;Trinity College Dublin, Dublin 2, Ireland, email: padraig.cunningham@cs.tcd.ie;University College Dublin, Dublin 4, Ireland, email: barry.smyth@cs.ucd.ie
Venue:
Proceedings of the 2006 conference on ECAI 2006: 17th European Conference on Artificial Intelligence August 29 -- September 1, 2006, Riva del Garda, Italy
Year:
2006

Citing 11

C4.5: programs for machine learning

Internet e-mail: Protocols, Standards, and Implementation

A Memory-Based Approach to Anti-Spam Filtering for Mailing Lists
Information Retrieval

Diagnosis and Decision Support
Case-Based Reasoning Technology, From Foundations to Applications

Using latent semantic indexing to filter spam
Proceedings of the 2003 ACM symposium on Applied computing

An evaluation of statistical spam filtering techniques
ACM Transactions on Asian Language Information Processing (TALIP)

A LVQ-based neural network anti-spam email approach
ACM SIGOPS Operating Systems Review

Combining text and heuristics for cost-sensitive spam filtering
ConLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7

A case-based technique for tracking concept drift in spam filtering
Knowledge-Based Systems

Generating estimates of classification confidence for a case-based spam filter
ICCBR'05 Proceedings of the 6th international conference on Case-Based Reasoning Research and Development

Support vector machines for spam categorization
IEEE Transactions on Neural Networks

Cited 8

Mining competent case bases for case-based reasoning
Artificial Intelligence

Textual case-based reasoning for spam filtering: a comparison of feature-based and feature-free approaches
Artificial Intelligence Review

Catching the Drift: Using Feature-Free Case-Based Reasoning for Spam Filtering
ICCBR '07 Proceedings of the 7th international conference on Case-Based Reasoning: Case-Based Reasoning Research and Development

Assessing Classification Accuracy in the Revision Stage of a CBR Spam Filtering System
ICCBR '07 Proceedings of the 7th international conference on Case-Based Reasoning: Case-Based Reasoning Research and Development

Adaptive Spam Detection Inspired by a Cross-Regulation Model of Immune Dynamics: A Study of Concept Drift
ICARIS '08 Proceedings of the 7th international conference on Artificial Immune Systems

Improving tweet stream classification by detecting changes in word probability
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval

Tracking concept drift in malware families
Proceedings of the 5th ACM workshop on Security and artificial intelligence

Integrated instance- and class-based generative modeling for text classification
Proceedings of the 18th Australasian Document Computing Symposium

Abstract

While text classification has been identified for some time as a promising application area for Artificial Intelligence, so far few deployed applications have been described. In this paper we present a spam filtering system that uses example-based machine learning techniques to train a classifier from examples of spam and legitimate email. This approach has the advantage that it can personalise to the specifics of the user's filtering preferences. This classifier can also automatically adjust over time to account for the changing nature of spam (and indeed changes in the profile of legitimate email). A significant software engineering challenge in developing this system was to ensure that it could interoperate with existing email systems to allow easy management of the training data over time. This system has been deployed and evaluated over an extended period and the results of this evaluation are presented here.
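
The idea of an example-based filter that adjusts over time can be illustrated with a small sketch. This is not the ECUE implementation: the abstract does not give the features, similarity measure, or update policy, so everything below is an assumption made for illustration. The class name `ExampleBasedFilter`, the word-set features, the Jaccard similarity, the similarity-weighted vote over `k` neighbours, and the `correct` method are all hypothetical.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase an email body and return its set of distinct words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Similarity of two word sets: shared words divided by all words."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


class ExampleBasedFilter:
    """Toy nearest-neighbour filter (hypothetical, not the authors' code)."""

    def __init__(self, k=3):
        self.k = k
        self.cases = []  # stored examples as (word_set, label) pairs

    def add_example(self, text, label):
        """Store one labelled email; label is 'spam' or 'legit'."""
        self.cases.append((tokenize(text), label))

    def classify(self, text):
        """Label an email by a similarity-weighted vote of its k nearest examples."""
        words = tokenize(text)
        scored = sorted(
            ((jaccard(words, case_words), label) for case_words, label in self.cases),
            reverse=True,
        )
        votes = defaultdict(float)
        for similarity, label in scored[: self.k]:
            votes[label] += similarity
        # Assumption: with no evidence at all, fall back to 'legit'
        # so that unknown mail is not discarded.
        if not votes or max(votes.values()) == 0.0:
            return "legit"
        return max(votes, key=votes.get)

    def correct(self, text, true_label):
        """Add a mail the user reports as misfiled, so later mail is compared against it."""
        self.add_example(text, true_label)


spam_filter = ExampleBasedFilter(k=3)
spam_filter.add_example("meeting agenda for monday", "legit")
spam_filter.add_example("agenda for project meeting", "legit")
spam_filter.add_example("cheap pills buy now", "spam")
spam_filter.add_example("buy cheap watches now", "spam")

# A new style of spam that borrows legitimate-looking vocabulary.
drifted = "agenda for your prize claim meeting"
before = spam_filter.classify(drifted)
spam_filter.correct(drifted, "spam")
after = spam_filter.classify(drifted)
```

In this sketch the user's own mail forms the training data, which is how the filter personalises, and each correction changes the stored examples, which is one simple way a classifier could follow changes in spam. A deployed system would also need a policy for retiring old examples, which is not shown here.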