We compare two statistical methods for identifying spam, or junk electronic mail. Spam filters are classifiers that determine whether an email is junk or not. The proliferation of spam email has made electronic filtering vitally important, and we discuss the magnitude of the problem. We examine the Naive Bayesian method in relation to the 'Chi by Degrees of Freedom' approach, which is used in the field of authorship identification. Both methods produce very promising results. However, 'Chi by Degrees of Freedom' has the advantage of providing significance measures, which help to reduce false positives. Statistics based on character-level tokenization prove more effective than those based on word-level tokenization.
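The abstract gives no formulas, so the following is only a minimal sketch of how the two scores might be computed over character-level tokens. It assumes overlapping character trigrams as tokens, add-one smoothing for the Naive Bayesian score, and one common corpus-comparison formulation of chi by degrees of freedom (chi-square over the most frequent tokens, divided by the number of tokens minus one). All function names, the `top` cutoff, and the toy messages are hypothetical and are not taken from the paper. Significance thresholds are not shown.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Character-level tokens: overlapping n-grams of the lowercased text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def count_tokens(messages, n=3):
    """Pool the n-gram counts of a list of messages into one Counter."""
    counts = Counter()
    for message in messages:
        counts.update(char_ngrams(message, n))
    return counts


def nb_log_score(tokens, class_counts, vocab_size, prior):
    """Log of prior * product of add-one-smoothed token probabilities."""
    total = sum(class_counts.values())
    score = math.log(prior)
    for token in tokens:
        score += math.log((class_counts[token] + 1) / (total + vocab_size))
    return score


def chi_by_dof(doc_counts, corpus_counts, top=50):
    """Chi-square between a document and a corpus over their `top` most
    frequent tokens, divided by the degrees of freedom (tokens - 1).
    Lower values mean the document looks more like the corpus."""
    n_doc = sum(doc_counts.values())
    n_corpus = sum(corpus_counts.values())
    if n_doc == 0 or n_corpus == 0:
        raise ValueError("both document and corpus need at least one token")
    joint = doc_counts + corpus_counts
    tokens = [token for token, _ in joint.most_common(top)]
    chi2 = 0.0
    for token in tokens:
        observed_doc = doc_counts[token]
        observed_corpus = corpus_counts[token]
        combined = observed_doc + observed_corpus
        # Expected counts if the token were equally likely in both samples.
        expected_doc = combined * n_doc / (n_doc + n_corpus)
        expected_corpus = combined * n_corpus / (n_doc + n_corpus)
        chi2 += (observed_doc - expected_doc) ** 2 / expected_doc
        chi2 += (observed_corpus - expected_corpus) ** 2 / expected_corpus
    return chi2 / max(len(tokens) - 1, 1)


# Toy data, invented purely for illustration.
spam = ["win free money now", "free prize click now", "cheap money free offer"]
ham = ["meeting moved to monday", "please review the attached report",
       "lunch on monday at noon"]
spam_counts, ham_counts = count_tokens(spam), count_tokens(ham)
vocab_size = len(set(spam_counts) | set(ham_counts))

message = "free money prize now"
tokens = char_ngrams(message)
doc_counts = Counter(tokens)

nb_label = ("spam" if nb_log_score(tokens, spam_counts, vocab_size, 0.5)
            > nb_log_score(tokens, ham_counts, vocab_size, 0.5) else "ham")
chi_label = ("spam" if chi_by_dof(doc_counts, spam_counts)
             < chi_by_dof(doc_counts, ham_counts) else "ham")
print(nb_label, chi_label)
```

In this sketch the Naive Bayesian score picks the class with the higher log-probability, while the chi-based score picks the corpus the message deviates from least. Because the chi-based value is a chi-square statistic scaled by its degrees of freedom, it could in principle be compared against a significance threshold before a message is rejected, which is the property the abstract credits with reducing false positives.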