Effect of feature selection methods on machine learning classifiers for detecting email spams

  • Authors:
  • Shrawan Kumar Trivedi;Shubhamoy Dey

  • Affiliations:
  • Indian Institute of Management Prabandh Shikhar, Rau Indore, India;Indian Institute of Management Prabandh Shikhar, Rau Indore, India

  • Venue:
  • Proceedings of the 2013 Research in Adaptive and Convergent Systems
  • Year:
  • 2013

Abstract

This research examines how features selected by two feature selection methods, Genetic Search and Greedy Stepwise Search, affect popular machine learning classifiers: Bayesian, Naive Bayes, Support Vector Machine, and Genetic Algorithm. Tests were performed on two publicly available spam email datasets, "Enron" and "SpamAssassin". The results show that Greedy Stepwise Search is a good feature selection method for spam email detection. Among the classifiers, Support Vector Machine was found to be the best in terms of both accuracy and false positive rate.
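
To make the terms in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a forward variant of greedy stepwise search (add the single best feature each round, stop when nothing improves) and the label convention 1 = spam, 0 = legitimate. The function names, the `score_fn` callback, and the toy feature weights are all hypothetical and chosen only for illustration.

```python
from typing import Callable, FrozenSet, Iterable, List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of emails labelled correctly (assumed labels: 1 = spam, 0 = legitimate)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def false_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of legitimate emails (label 0) wrongly flagged as spam (label 1)."""
    false_pos = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    negatives = sum(t == 0 for t in y_true)
    return false_pos / negatives if negatives else 0.0


def greedy_forward_selection(
    candidates: Iterable[str],
    score_fn: Callable[[FrozenSet[str]], float],
) -> List[str]:
    """Greedy forward search: repeatedly add the feature that most improves
    score_fn, and stop as soon as no remaining feature improves the score."""
    selected: List[str] = []
    remaining = list(candidates)
    best = score_fn(frozenset())
    while remaining:
        # Score every one-feature extension of the current subset.
        scored = [(score_fn(frozenset(selected + [f])), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best:
            break  # no extension helps, so the search terminates
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected


# Toy usage with made-up weights standing in for a real evaluation score
# (in practice score_fn would be, e.g., cross-validated classifier accuracy).
toy_weights = {"free": 3, "offer": 2, "meeting": -1}
chosen = greedy_forward_selection(
    toy_weights, lambda subset: sum(toy_weights[f] for f in subset)
)
```

In a real experiment `score_fn` would train and evaluate a classifier on the candidate subset. The false positive rate is reported alongside accuracy because, for spam filtering, flagging a legitimate email as spam is usually the costlier mistake.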