An Empirical Performance Comparison of Machine Learning Methods for Spam E-Mail Categorization

Authors:
Chih-Chin Lai; Ming-Chi Tsai
Affiliations:
National University of Tainan, Taiwan; Shu-Te University, Kaohsiung County, Taiwan
Venue:
HIS '04 Proceedings of the Fourth International Conference on Hybrid Intelligent Systems
Year:
2004

Abstract

The increasing volume of unsolicited bulk e-mail (also known as spam) has generated a need for reliable anti-spam filters. Using a classifier based on machine learning techniques to automatically filter out spam e-mail has drawn many researchers' attention. In this paper, we review some relevant ideas and conduct a set of systematic experiments on e-mail categorization, in which four machine learning algorithms are applied to different parts of an e-mail. Experimental results reveal that the e-mail header provides very useful information for detecting spam with all of the machine learning algorithms considered.
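
To illustrate what "different parts of an e-mail" can mean in practice, the following is a minimal sketch, not the authors' implementation. It splits a raw message into header and body with Python's standard `email` package and builds a separate bag-of-words count for each part, which could then be fed to any classifier. The tokenizer, the function names `tokenize`, `split_parts`, and `part_features`, and the sample message are all assumptions made for illustration; the abstract does not specify the feature representation or name the four algorithms.

```python
import re
from collections import Counter
from email import message_from_string

# Hypothetical tokenizer: lowercase alphanumeric runs. The paper's actual
# preprocessing is not described in the abstract.
TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return TOKEN.findall(text.lower())


def split_parts(raw):
    """Return header, body, and combined text views of one raw message."""
    msg = message_from_string(raw)
    header = "\n".join(f"{name}: {value}" for name, value in msg.items())
    payload = msg.get_payload()
    # Simplification: multipart messages return a list of sub-messages here;
    # this sketch only handles single-part text bodies.
    body = payload if isinstance(payload, str) else ""
    return {"header": header, "body": body, "all": header + "\n" + body}


def part_features(raw):
    """Build one bag-of-words Counter per message part."""
    return {part: Counter(tokenize(text)) for part, text in split_parts(raw).items()}


# Made-up sample message, for illustration only.
SAMPLE = (
    "From: promo@example.com\n"
    "Subject: Win a FREE prize\n"
    "\n"
    "Click now to claim your prize.\n"
)

features = part_features(SAMPLE)
print(features["header"])
print(features["body"])
```

Training the same classifier once per key (`header`, `body`, `all`) and comparing accuracy would be one way to reproduce the kind of per-part comparison the abstract describes.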