A Comparative Impact Study of Attribute Selection Techniques on Naïve Bayes Spam Filters

  • Authors:
  • J. R. Méndez;I. Cid;D. Glez-Peña;M. Rocha;F. Fdez-Riverola

  • Affiliations:
  • Dept. Informática, University of Vigo, Escuela Superior de Ingeniería Informática Edificio Politécnico, Ourense, Spain 32004;Dept. Informática, University of Vigo, Escuela Superior de Ingeniería Informática Edificio Politécnico, Ourense, Spain 32004;Dept. Informática, University of Vigo, Escuela Superior de Ingeniería Informática Edificio Politécnico, Ourense, Spain 32004;Dept. Informática, University of Minho, Centro de Ciências e Tecnologias da Computação, Braga, Portugal 4710-057;Dept. Informática, University of Vigo, Escuela Superior de Ingeniería Informática Edificio Politécnico, Ourense, Spain 32004

  • Venue:
  • ICDM '08 Proceedings of the 8th industrial conference on Advances in Data Mining: Medical Applications, E-Commerce, Marketing, and Theoretical Aspects
  • Year:
  • 2008

Abstract

The main problem of the Internet e-mail service is the massive delivery of spam messages. Every day, Internet users receive millions of unwanted and unhelpful messages that clutter their mailboxes. Fortunately, there are now different kinds of filters able to automatically identify and delete most of these messages. To reduce the amount of information that must be processed, only distinctive attributes of spam and legitimate e-mails are selected. This work presents a comparative study of the performance of five well-known feature selection techniques when applied in conjunction with four different types of Naïve Bayes classifier. The experimental results show the importance of choosing an appropriate feature selection technique in order to obtain the most accurate results.
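
The pipeline the abstract describes (select a few distinctive attributes, then classify with Naïve Bayes) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the five selection techniques or the four classifier variants, so information gain and a multinomial Naïve Bayes with Laplace smoothing are used here as stand-ins, and the toy messages and all function names are hypothetical.

```python
import math
from collections import Counter

# Toy corpus (hypothetical): each message is a list of tokens.
DOCS = [
    ["cheap", "pills", "offer"],
    ["cheap", "offer", "now"],
    ["meeting", "agenda", "now"],
    ["project", "meeting", "notes"],
]
LABELS = ["spam", "spam", "ham", "ham"]


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


def information_gain(docs, labels, term):
    """Information gain of the binary attribute 'term occurs in the message'."""
    n = len(docs)
    base = entropy(Counter(labels).values())
    with_term = Counter(l for d, l in zip(docs, labels) if term in d)
    without_term = Counter(l for d, l in zip(docs, labels) if term not in d)
    conditional = 0.0
    for part in (with_term, without_term):
        size = sum(part.values())
        if size:
            conditional += size / n * entropy(part.values())
    return base - conditional


def select_features(docs, labels, k):
    """Keep the k terms with the highest information gain (ties broken by name)."""
    vocabulary = {t for d in docs for t in d}
    ranked = sorted(vocabulary, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


def train_nb(docs, labels, vocab):
    """Multinomial Naive Bayes restricted to the selected vocabulary."""
    vocab = set(vocab)
    class_counts = Counter(labels)
    term_counts = {c: Counter() for c in class_counts}
    for tokens, label in zip(docs, labels):
        term_counts[label].update(t for t in tokens if t in vocab)
    model = {}
    for c in class_counts:
        total = sum(term_counts[c].values()) + len(vocab)  # Laplace smoothing
        log_prior = math.log(class_counts[c] / len(labels))
        log_like = {t: math.log((term_counts[c][t] + 1) / total) for t in vocab}
        model[c] = (log_prior, log_like)
    return model


def classify(model, tokens):
    """Return the class with the highest log-posterior; unselected terms are ignored."""
    def score(c):
        log_prior, log_like = model[c]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(model, key=score)


SELECTED = select_features(DOCS, LABELS, k=3)
MODEL = train_nb(DOCS, LABELS, SELECTED)
```

Swapping `information_gain` for another scoring function, or `train_nb` for another Naïve Bayes variant, while keeping the rest fixed is one way to set up the kind of comparison the study reports.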