Neural recognition and genetic features selection for robust detection of e-mail spam

  • Authors:
  • Dimitris Gavrilis;Ioannis G. Tsoulos;Evangelos Dermatas

  • Affiliations:
  • Electrical & Computer Engineering, University of Patras;Computer Science Department, University of Ioannina;Electrical & Computer Engineering, University of Patras

  • Venue:
  • SETN'06 Proceedings of the 4th Hellenic conference on Advances in Artificial Intelligence
  • Year:
  • 2006

Abstract

This paper presents a method for feature selection and classification of e-mail spam messages. Features are selected in two steps: a first selection is made by measuring the entropy of the features, and a fine-tuning selection is then carried out with a genetic algorithm. For classification, a Radial Basis Function Network is used to ensure a robust classification rate even when the cluster structure is complex. The results show that two-level feature selection achieves better accuracy than one-stage selection, and that using a lemmatizer or a stop-word list gives only minimal improvement in classification. The proposed method achieves 96-97% average accuracy using only 20 features out of 15000.
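The two-step selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which entropy measure is used, so information gain over binary term indicators is assumed here, and a nearest-centroid classifier stands in for the Radial Basis Function Network as the genetic algorithm's fitness function. All function names, parameters and the toy data are hypothetical.

```python
import math
import random

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def information_gain(X, y, j):
    """Assumed entropy criterion: label-entropy reduction from binary feature j."""
    present = [y[i] for i in range(len(y)) if X[i][j]]
    absent = [y[i] for i in range(len(y)) if not X[i][j]]
    n = len(y)
    cond = len(present) / n * entropy(present) + len(absent) / n * entropy(absent)
    return entropy(y) - cond

def coarse_select(X, y, k):
    """Step 1: keep the k features with the highest information gain."""
    scores = [(information_gain(X, y, j), j) for j in range(len(X[0]))]
    scores.sort(key=lambda s: (-s[0], s[1]))  # ties broken by feature index
    return [j for _, j in scores[:k]]

def centroid_accuracy(X, y, feats):
    """Stand-in fitness (NOT an RBF network): nearest-centroid training accuracy."""
    if not feats:
        return 0.0
    cents = {}
    for c in set(y):
        rows = [X[i] for i in range(len(y)) if y[i] == c]
        cents[c] = [sum(r[j] for r in rows) / len(rows) for j in feats]
    correct = 0
    for xi, yi in zip(X, y):
        v = [xi[j] for j in feats]
        pred = min(sorted(cents),
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(v, cents[c])))
        correct += pred == yi
    return correct / len(y)

def genetic_select(X, y, candidates, generations=20, pop_size=12, seed=0):
    """Step 2: evolve bitmasks over the candidates (needs at least 2 candidates)."""
    rng = random.Random(seed)

    def decode(mask):
        return [f for f, bit in zip(candidates, mask) if bit]

    def fitness(mask):
        return centroid_accuracy(X, y, decode(mask))

    # Start from the full candidate set plus random masks.
    pop = [[1] * len(candidates)]
    pop += [[rng.randint(0, 1) for _ in candidates] for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # elitism: the best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(candidates))  # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:                   # occasional bit-flip mutation
                pos = rng.randrange(len(child))
                child[pos] = 1 - child[pos]
            children.append(child)
        pop = parents + children
    return decode(max(pop, key=fitness))

# Toy corpus: rows are messages, columns are binary term indicators.
# Only column 0 separates spam (1) from non-spam (0).
X = [
    [1, 0, 1, 0, 1, 0], [1, 1, 0, 0, 1, 1], [1, 0, 1, 1, 0, 0], [1, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1], [0, 1, 0, 0, 1, 0], [0, 0, 1, 1, 0, 1], [0, 1, 0, 1, 0, 0],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

candidates = coarse_select(X, y, 3)
chosen = genetic_select(X, y, candidates)
print(candidates, chosen, centroid_accuracy(X, y, chosen))
```

In this sketch the entropy step cheaply shrinks the search space, so the genetic algorithm only has to explore subsets of a short candidate list. A realistic setup would also score fitness on held-out messages rather than on training accuracy.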