A hybrid Gini PSO-SVM feature selection based on Taguchi method: an evaluation on email filtering

  • Authors:
  • Noormadinah Allias;Megat Norulazmi Megat;Mohamed Noor;Mohd. Nazri Ismail

  • Affiliations:
  • Universiti Kuala Lumpur, Kuala Lumpur, Malaysia;Universiti Kuala Lumpur, Kuala Lumpur, Malaysia;Universiti Kuala Lumpur, Kuala Lumpur, Malaysia;National Defence University of Malaysia, Kem Perdana Sungai Besi, Kuala Lumpur

  • Venue:
  • Proceedings of the 8th International Conference on Ubiquitous Information Management and Communication
  • Year:
  • 2014

Abstract

Spam flooding email servers is an arms-race problem, and filtering spam from legitimate email remains an active research topic. Among the methods proposed so far, those based on machine learning algorithms have been more successful at spam filtering. In machine learning, however, the high dimensionality of the feature space after preprocessing is a major hurdle for the classifier, and an excessive number of features can also degrade classification results. In this paper, we propose a two-stage feature selection approach based on the Taguchi method that reduces the dimensionality of the feature space while still obtaining good classification results for spam filtering. First, we apply Gini Index feature selection to reduce the number of terms; we then apply the Taguchi method to assist Gini Index and PSO-SVM in selecting the best combination of parameter settings. The method is trained and tested on the Lingspam dataset, and its performance is compared with traditional feature selection and with recent work by another researcher. The results show that the proposed method achieves good precision and recall with the lowest number of features.
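The first stage, Gini Index term selection, can be illustrated with a minimal sketch. The abstract does not state which Gini formulation the paper uses, so the code below assumes one common text-categorization variant, Gini(t) = Σ_c P(t|c)² · P(c|t)², estimated from document frequencies. The function names `gini_scores` and `top_k_terms` and the toy corpus are hypothetical and are not taken from the paper; the Taguchi and PSO-SVM stage is not shown.

```python
from collections import defaultdict


def gini_scores(docs, labels):
    """Score each term with sum_c P(t|c)^2 * P(c|t)^2 (assumed Gini variant).

    docs   -- list of token lists, one per message
    labels -- list of class labels, aligned with docs
    """
    class_docs = defaultdict(int)                        # class -> number of documents
    term_class = defaultdict(lambda: defaultdict(int))   # term -> class -> document count
    for tokens, label in zip(docs, labels):
        class_docs[label] += 1
        for term in set(tokens):                         # count each term once per document
            term_class[term][label] += 1

    scores = {}
    for term, per_class in term_class.items():
        total = sum(per_class.values())                  # documents containing the term
        scores[term] = sum(
            (n / class_docs[c]) ** 2 * (n / total) ** 2  # P(t|c)^2 * P(c|t)^2
            for c, n in per_class.items()
        )
    return scores


def top_k_terms(docs, labels, k):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    scores = gini_scores(docs, labels)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Toy corpus made up for illustration; it is not the Lingspam dataset.
docs = [
    ["free", "money", "now"],
    ["free", "offer", "now"],
    ["meeting", "agenda", "now"],
    ["linguistics", "meeting", "paper"],
]
labels = ["spam", "spam", "ham", "ham"]

scores = gini_scores(docs, labels)
selected = top_k_terms(docs, labels, 2)
```

In this toy corpus, terms that appear in every document of one class and in none of the other, such as "free" and "meeting", receive the maximum score of 1.0, while "now", which occurs in both classes, scores lower. Keeping only the top-ranked terms is what reduces the dimensionality passed on to the classifier.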