Measuring stability of feature ranking techniques: a noise-based approach
International Journal of Business Intelligence and Data Mining
The application of feature ranking to software engineering datasets remains rare. In this study, we consider wrapper-based feature ranking, in which nine performance metrics, each computed with the aid of a particular learner, serve as feature ranking techniques. We consider five learners and take two different approaches, each in conjunction with one of two methodologies: 3-fold Cross-Validation (CV) and 3-fold Cross-Validation Risk Impact (CV-R). The classifiers are Naïve Bayes (NB), Multi-Layer Perceptron (MLP), k-Nearest Neighbors (kNN), Support Vector Machines (SVM), and Logistic Regression (LR). The performance metrics used as ranking techniques are Overall Accuracy (OA), F-Measure (FM), Geometric Mean (GM), Arithmetic Mean (AM), Area under ROC (AUC), Area under PRC (PRC), Best F-Measure (BFM), Best Geometric Mean (BGM), and Best Arithmetic Mean (BAM). To evaluate classifier performance after feature selection has been applied, we use AUC as the performance evaluator. This paper presents a preliminary report on our proposed wrapper-based feature ranking approach to software defect prediction problems.
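The general idea of wrapper-based feature ranking described above can be sketched as follows. This is a minimal illustration, not the authors' exact procedure: it assumes each feature is scored individually by a learner's cross-validated performance (here Logistic Regression with AUC, one of the paper's learner/metric pairs), and features are then ranked by that score; the function name and the synthetic dataset are hypothetical.

```python
# Hedged sketch of wrapper-based feature ranking: score each feature
# alone via a learner's 3-fold cross-validated AUC, then rank features
# by that score (assumed interpretation of the paper's approach).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def wrapper_feature_ranking(X, y, learner, metric="roc_auc", folds=3):
    """Rank features by the learner's cross-validated score on each
    feature in isolation; higher score = more predictive feature."""
    scores = [
        cross_val_score(learner, X[:, [j]], y, cv=folds, scoring=metric).mean()
        for j in range(X.shape[1])
    ]
    # argsort descending: index of the best-scoring feature comes first
    ranking = np.argsort(scores)[::-1]
    return ranking, scores

# Synthetic example: 10 features, of which 3 are informative
X, y = make_classification(n_samples=300, n_features=10,
                           n_informative=3, random_state=0)
ranking, scores = wrapper_feature_ranking(X, y, LogisticRegression(max_iter=1000))
print("feature ranking (best first):", list(ranking))
```

Swapping in a different learner (e.g., an MLP or kNN classifier) or a different `scoring` string yields the other learner/metric combinations the study evaluates.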