The Impact of Noise in Spam Filtering: A Case Study

  • Authors:
  • I. Cid; L. R. Janeiro; J. R. Méndez; D. Glez-Peña; F. Fdez-Riverola

  • Affiliations:
  • Dept. Informática, University of Vigo, Escuela Superior de Ingeniería Informática, Edificio Politécnico, Ourense, Spain 32004 (all authors)

  • Venue:
  • ICDM '08: Proceedings of the 8th Industrial Conference on Advances in Data Mining: Medical Applications, E-Commerce, Marketing, and Theoretical Aspects
  • Year:
  • 2008

Abstract

Unsolicited commercial e-mail (UCE), more commonly known as spam, is a growing problem on the Internet. Every day, people receive large numbers of unwanted advertising e-mails that flood their mailboxes. Fortunately, several spam-filtering approaches can detect and automatically delete this kind of message. However, spammers have adopted techniques that reduce the effectiveness of these filters by introducing noise into their messages. This work presents a new pre-processing technique for noise identification and reduction and reports preliminary results obtained when it is applied with a Flexible Bayes classifier. The experimental analysis confirms the advantages of using the proposed technique to improve spam filter accuracy.