An Empirical Performance Comparison of Machine Learning Methods for Spam E-Mail Categorization

  • Authors: Chih-Chin Lai; Ming-Chi Tsai
  • Affiliations: National University of Tainan, Taiwan; Shu-Te University, Kaohsiung County, Taiwan
  • Venue: HIS '04 Proceedings of the Fourth International Conference on Hybrid Intelligent Systems
  • Year: 2004

Abstract

The increasing volume of unsolicited bulk e-mail (also known as spam) has generated a need for reliable anti-spam filters. Using classifiers based on machine learning techniques to filter out spam e-mail automatically has drawn many researchers' attention. In this paper, we review some relevant ideas and conduct a set of systematic experiments on e-mail categorization, applying four machine learning algorithms to different parts of an e-mail. Experimental results reveal that the e-mail header provides very useful information for spam detection with all of the machine learning algorithms considered.
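
To illustrate what "different parts of an e-mail" can mean in practice, the following is a minimal sketch, not the authors' implementation. It assumes a raw RFC 5322 message string and uses Python's standard `email` package to separate the header from the body, then builds simple word counts that a classifier could consume. The function names `split_email_parts` and `token_counts`, the bag-of-words features, and the sample message are all hypothetical choices made for this example; the abstract does not specify the feature representation used.

```python
import re
from collections import Counter
from email import message_from_string


def split_email_parts(raw: str) -> dict:
    """Split a raw message into header text, body text, and both combined.

    Hypothetical helper: the part names are assumptions for illustration.
    """
    msg = message_from_string(raw)
    # Rebuild the header block as "Name: value" lines.
    header = "\n".join(f"{name}: {value}" for name, value in msg.items())
    if msg.is_multipart():
        # Keep only the plain-text parts of a multipart message.
        body = "\n".join(
            part.get_payload()
            for part in msg.walk()
            if part.get_content_type() == "text/plain"
        )
    else:
        body = msg.get_payload()
    return {"header": header, "body": body, "all": header + "\n" + body}


def token_counts(text: str) -> Counter:
    """Lower-case bag-of-words counts (an assumed feature representation)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


# Made-up sample message, used only to exercise the helpers.
raw = "From: a@example.com\nSubject: Cheap offer\n\nBuy now, buy cheap\n"
parts = split_email_parts(raw)
header_features = token_counts(parts["header"])
body_features = token_counts(parts["body"])
```

Each feature set (`header_features`, `body_features`, or counts over `parts["all"]`) could then be passed to the same classifier in turn, so that the contribution of each part of the message can be compared under otherwise identical conditions.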