Wrapper-Based Feature Ranking for Software Engineering Metrics

  • Authors:
  • Wilker Altidor;Taghi M. Khoshgoftaar;Amri Napolitano

  • Venue:
  • ICMLA '09 Proceedings of the 2009 International Conference on Machine Learning and Applications
  • Year:
  • 2009

Abstract

The application of feature ranking to software engineering datasets is rare at best. In this study, we consider wrapper-based feature ranking, in which nine performance metrics, each aided by a particular learner, are evaluated as ranking techniques. We consider five learners and take two different approaches, each in conjunction with one of two different methodologies: 3-fold Cross-Validation (CV) and 3-fold Cross-Validation Risk Impact (CV-R). The classifiers are Naïve Bayes (NB), Multi-Layer Perceptron (MLP), k-Nearest Neighbors (kNN), Support Vector Machines (SVM), and Logistic Regression (LR). The performance metrics used as ranking techniques are Overall Accuracy (OA), F-Measure (FM), Geometric Mean (GM), Arithmetic Mean (AM), Area Under the ROC Curve (AUC), Area Under the Precision-Recall Curve (PRC), Best F-Measure (BFM), Best Geometric Mean (BGM), and Best Arithmetic Mean (BAM). To evaluate classifier performance after feature selection has been applied, we use AUC as the performance evaluator. This paper is a preliminary report on our proposed wrapper-based feature ranking approach to software defect prediction problems.
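The abstract does not spell out the ranking procedure, so the following is only a minimal sketch of one plausible reading: each feature is scored on its own by a learner's 3-fold cross-validated AUC, and features are sorted by that score. The one-feature Gaussian Naïve Bayes stand-in, the pooling of out-of-fold scores into a single AUC, the synthetic data, and all function names (`auc`, `gaussian_nb_1d`, `stratified_folds`, `rank_features`) are assumptions made for illustration, not the authors' implementation. CV-R is not sketched because it is not defined here.

```python
import math
import random


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    # A positive outranking a negative counts 1, a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gaussian_nb_1d(train_x, train_y):
    """Fit a one-feature Gaussian Naive Bayes model (assumed stand-in learner).

    Returns a function mapping a feature value to the log-odds of class 1.
    """
    params = {}
    for c in (0, 1):
        xs = [x for x, y in zip(train_x, train_y) if y == c]
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs) + 1e-9  # avoid zero variance
        params[c] = (mean, var, len(xs) / len(train_x))

    def log_joint(x, c):
        mean, var, prior = params[c]
        return (math.log(prior) - 0.5 * math.log(2 * math.pi * var)
                - (x - mean) ** 2 / (2 * var))

    return lambda x: log_joint(x, 1) - log_joint(x, 0)


def stratified_folds(labels, k=3, seed=0):
    """Split indices into k folds, keeping both classes present in every fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for c in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == c]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def rank_features(X, y, k=3, seed=0):
    """Score each feature alone by k-fold cross-validated AUC; best first."""
    folds = stratified_folds(y, k, seed)
    results = []
    for f in range(len(X[0])):
        scores = [0.0] * len(y)
        for test_idx in folds:
            held_out = set(test_idx)
            train = [i for i in range(len(y)) if i not in held_out]
            model = gaussian_nb_1d([X[i][f] for i in train], [y[i] for i in train])
            for i in test_idx:
                scores[i] = model(X[i][f])  # out-of-fold score only
        results.append((f, auc(scores, y)))
    return sorted(results, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Synthetic data: feature 0 tracks the label, feature 1 is pure noise.
    rng = random.Random(42)
    y = [i % 2 for i in range(60)]
    X = [[label * 2.0 + rng.gauss(0, 0.5), rng.gauss(0, 1.0)] for label in y]
    for feature, score in rank_features(X, y):
        print(f"feature {feature}: AUC = {score:.3f}")
```

Under this reading, swapping `auc` for another of the nine metrics, or `gaussian_nb_1d` for another of the five learners, would give a different ranking technique while the cross-validation loop stays the same.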