An empirical study of three machine learning methods for spam filtering

  • Authors:
  • Chih-Chin Lai

  • Affiliations:
  • Department of Computer Science and Information Engineering, National University of Tainan, Taiwan 700, Taiwan

  • Venue:
  • Knowledge-Based Systems
  • Year:
  • 2007

Abstract

The increasing volume of unsolicited bulk e-mail (also known as spam) is a growing annoyance for most Internet users. Using a classifier based on a specific machine-learning technique to filter out spam e-mail automatically has drawn many researchers' attention. This paper presents a comparative study of the performance of three commonly used machine learning methods in spam filtering. In addition, we try to integrate two spam filtering methods to obtain better performance. A set of systematic experiments has been conducted in which these methods are applied to different parts of an e-mail. The experiments show that using the header alone can achieve satisfactory performance, and that integrating disparate methods is a promising way to fight spam.
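
The abstract does not say how an e-mail is divided into parts, so the following is only a minimal sketch of one plausible way to do it with Python's standard `email` package. The function name `split_email_parts` and the dictionary keys are hypothetical and are not taken from the paper; each returned string could then be fed to a separate classifier.

```python
from email import message_from_string


def split_email_parts(raw: str) -> dict:
    """Split a raw RFC 822 style message into header, subject and body text.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    msg = message_from_string(raw)

    # All header fields, one "Name: value" pair per line.
    header = "\n".join(f"{name}: {value}" for name, value in msg.items())

    # The subject line on its own (empty string if the field is absent).
    subject = msg.get("Subject", "")

    # Collect the textual payload of every text/* part; multipart
    # containers are skipped because their main type is "multipart".
    body_chunks = []
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            payload = part.get_payload()
            if isinstance(payload, str):
                body_chunks.append(payload)
    body = "\n".join(body_chunks)

    return {
        "header": header,
        "subject": subject,
        "body": body,
        "all": header + "\n" + body,
    }
```

Keeping the parts in separate fields makes it straightforward to train and evaluate a filter on the header only, the body only, or the whole message, which is the kind of comparison the abstract describes.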