Analyzing the Performance of Spam Filtering Methods When Dimensionality of Input Vector Changes

  • Authors:
  • J. R. Méndez;B. Corzo;D. Glez-Peña;F. Fdez-Riverola;F. Díaz

  • Affiliations:
  • Computer Science Dept., University of Vigo, Escuela Superior de Ingeniería Informática, Edificio Politécnico, Campus Universitario As Lagoas s/n, 32004, Ourense, Spain;Dept. Advertising Graphics, Arts College of Oviedo, C/ Julián Clavería, 12, 33006, Oviedo, Spain;Computer Science Dept., University of Vigo, Escuela Superior de Ingeniería Informática, Edificio Politécnico, Campus Universitario As Lagoas s/n, 32004, Ourense, Spain;Computer Science Dept., University of Vigo, Escuela Superior de Ingeniería Informática, Edificio Politécnico, Campus Universitario As Lagoas s/n, 32004, Ourense, Spain;Computer Science Dept., University of Valladolid, Escuela Universitaria de Informática, Plaza Santa Eulalia, 9-11, 40005, Segovia, Spain

  • Venue:
  • MLDM '07 Proceedings of the 5th international conference on Machine Learning and Data Mining in Pattern Recognition
  • Year:
  • 2007

Abstract

Spam is a complex problem that hinders the exploitation of Internet resources. Several authorities have warned about the scale of this problem and urged everyone to fight it. In this paper we present an extensive analysis of how changing the dimensionality of the message representation influences the accuracy of some well-known classical spam filtering techniques. The conclusions drawn from the experiments will be useful for comparing the effects of dimensionality reorganization on classical filtering techniques with its effects on a successful spam filter model called SpamHunting.