A study of feature subset evaluators and feature subset searching methods for phishing classification

  • Authors:
  • Mahmoud Khonji; Andrew Jones; Youssef Iraqi

  • Affiliations:
  • Khalifa University, Sharjah, UAE; Khalifa University, Sharjah, UAE and Edith Cowan University; Khalifa University, Sharjah, UAE

  • Venue:
  • Proceedings of the 8th Annual Collaboration, Electronic messaging, Anti-Abuse and Spam Conference
  • Year:
  • 2011

Abstract

Phishing is a semantic attack that aims to take advantage of the naivety of users of electronic services (e.g. e-banking). A number of solutions have been proposed to minimize the impact of phishing attacks. The most accurate publicly known email phishing classifiers use machine learning techniques. Previous work on phishing email classification via machine learning has primarily focused on enhancing classification accuracy by adding novel features, ensembles, or classification algorithms. This study follows a different path by taking advantage of previously proposed features. Its primary focus is to enhance the classification accuracy of phishing email classifiers by evaluating various feature selection methods in order to find an effective subset of those previously proposed features. The feature subset selected in this study resulted in a classification model with an F1 score of 99.396%, using 21 heuristic features and a single classifier.
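
The abstract does not say which subset evaluators or search methods were compared, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a wrapper-style evaluator (a candidate subset is scored by the F1 score of a classifier trained on it) paired with a greedy forward search. The nearest-centroid classifier, the synthetic three-feature data set, and all function names are hypothetical stand-ins chosen to keep the example self-contained.

```python
import random


def f1_score(y_true, y_pred):
    """F1 = 2 * precision * recall / (precision + recall), positive class = 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def centroid_predict(train_X, train_y, test_X, subset):
    """Stand-in classifier: nearest class centroid, using only `subset` columns."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in subset]
    preds = []
    for x in test_X:
        dists = {
            label: sum((x[j] - c) ** 2 for j, c in zip(subset, cen))
            for label, cen in centroids.items()
        }
        preds.append(min(dists, key=dists.get))
    return preds


def evaluate_subset(subset, train, test):
    """Subset evaluator (wrapper style): F1 of the classifier on held-out data."""
    (train_X, train_y), (test_X, test_y) = train, test
    return f1_score(test_y, centroid_predict(train_X, train_y, test_X, subset))


def forward_selection(n_features, train, test):
    """Subset search: add the best single feature until F1 stops improving."""
    selected, best = [], 0.0
    while len(selected) < n_features:
        candidates = [j for j in range(n_features) if j not in selected]
        score, j = max((evaluate_subset(selected + [j], train, test), j)
                       for j in candidates)
        if score <= best:
            break
        selected.append(j)
        best = score
    return selected, best


def make_data(n, rng):
    """Synthetic data (assumption): feature 0 is informative, 1 and 2 are noise."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([label + rng.gauss(0, 0.1), rng.random(), rng.random()])
        y.append(label)
    return X, y


rng = random.Random(0)
train, test = make_data(100, rng), make_data(100, rng)
selected, best = forward_selection(3, train, test)
print("selected features:", selected, "F1:", round(best, 3))
```

In this sketch the search and the final score share one held-out split, which tends to give an optimistic estimate; a more careful setup would score the chosen subset on data that played no part in the selection.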