Combining text and heuristics for cost-sensitive spam filtering

  • Authors:
  • José M. Gómez Hidalgo;Manuel Maña López;Enrique Puertas Sanz

  • Affiliations:
  • Universidad Europea-CEES, Spain;Universidad de Vigo, Spain;Universidad Europea-CEES, Spain

  • Venue:
  • CoNLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7
  • Year:
  • 2000

Quantified Score

Hi-index 0.01

Abstract

Spam filtering is a text categorization task with special features that make it both interesting and difficult. First, the task has traditionally been performed using heuristics from the domain. Second, a cost model is required to avoid misclassifying legitimate messages. We present a comparative evaluation of several machine learning algorithms applied to spam filtering, considering both the text of the messages and a set of heuristics for the task. Cost-oriented biasing and evaluation are performed.
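
To illustrate what a cost model can mean in this setting, the following is a minimal sketch and not the authors' method. It assumes a classifier that outputs a spam probability, and a hypothetical parameter `cost_ratio` stating how many times worse it is to discard a legitimate message than to let a spam message through. The function names are illustrative only.

```python
def spam_threshold(cost_ratio):
    """Spam probability above which labelling a message as spam has lower expected cost.

    Assumption: a missed spam costs 1 and a blocked legitimate message costs
    `cost_ratio`. Labelling as spam costs (1 - p) * cost_ratio in expectation,
    labelling as legitimate costs p, so spam is chosen when
    p > cost_ratio / (1 + cost_ratio).
    """
    return cost_ratio / (1.0 + cost_ratio)


def classify(p_spam, cost_ratio):
    """Cost-biased decision for a single message given its spam probability."""
    return "spam" if p_spam > spam_threshold(cost_ratio) else "legitimate"


def total_cost(y_true, y_pred, cost_ratio):
    """Cost-sensitive evaluation: sum the cost of every misclassification."""
    cost = 0.0
    for truth, predicted in zip(y_true, y_pred):
        if truth == "legitimate" and predicted == "spam":
            cost += cost_ratio  # legitimate message lost: the expensive error
        elif truth == "spam" and predicted == "legitimate":
            cost += 1.0  # spam delivered: the cheap error
    return cost
```

With `cost_ratio = 9`, for example, a message is filtered only when its estimated spam probability exceeds 0.9, so the filter is biased toward keeping legitimate mail.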