Empirical studies on feature selection for software fault prediction

Authors:
Jiaqiang Chen;Shulong Liu;Xiang Chen;Qing Gu;Daoxu Chen
Affiliations:
Nanjing University, Nanjing, China;Nanjing University, Nanjing, China;Nanjing University, Nanjing, China and Nantong University, Nantong, China;Nanjing University, Nanjing, China;Nanjing University, Nanjing, China
Venue:
Proceedings of the 5th Asia-Pacific Symposium on Internetware
Year:
2013

Abstract

Classification-based software fault prediction methods aim to classify software modules as either fault-prone or non-fault-prone. Feature selection is a preprocessing step used to improve data quality. However, most previous research focuses on feature relevance analysis, and little work addresses feature redundancy analysis. We therefore propose a two-stage feature selection framework to address this gap. In the feature relevance analysis phase, we adopt three different relevance measures to obtain a relevant feature subset. Then, in the feature redundancy analysis phase, we use a cluster-based method to eliminate redundant features. To evaluate the proposed framework, we choose typical real-world software projects, including Eclipse projects and the NASA software project KC1. The empirical results show the effectiveness of the proposed framework.
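
The two-stage idea described in the abstract can be illustrated with a minimal sketch. The abstract does not name the three relevance measures or the clustering method, so everything concrete here is an assumption made for illustration: absolute Pearson correlation with the fault label stands in for a relevance measure, a greedy correlation-threshold grouping stands in for the cluster-based redundancy step, and the function names, the `top_k` and `redundancy_threshold` parameters, and the toy metric values are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, labels, top_k, redundancy_threshold=0.9):
    """Two-stage selection: relevance ranking, then cluster-based redundancy removal.

    columns: dict mapping feature name -> list of values, one per module.
    labels:  list of 0/1 fault labels, one per module.
    """
    # Stage 1 (relevance): score each feature against the label, keep the top_k.
    # Assumed measure: absolute Pearson correlation.
    relevance = {name: abs(pearson(vals, labels)) for name, vals in columns.items()}
    ranked = sorted(relevance, key=relevance.get, reverse=True)[:top_k]

    # Stage 2 (redundancy): visit features from most to least relevant and
    # greedily group each one with the first cluster whose representative
    # it correlates with strongly; otherwise it starts a new cluster.
    clusters = []
    for name in ranked:
        for cluster in clusters:
            representative = cluster[0]
            if abs(pearson(columns[name], columns[representative])) >= redundancy_threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])

    # Keep one feature per cluster: its most relevant member.
    return [cluster[0] for cluster in clusters]


# Toy data (hypothetical): six modules, four static code metrics.
columns = {
    "loc": [10, 200, 35, 400, 50, 320],
    "loc_dup": [20, 400, 70, 800, 100, 640],  # exact multiple of loc, so redundant
    "complexity": [1, 4, 2, 9, 8, 3],
    "comments": [5, 7, 6, 5, 7, 6],           # uncorrelated with the label
}
labels = [0, 1, 0, 1, 0, 1]

selected = select_features(columns, labels, top_k=3)
print(selected)
```

On this toy data, stage 1 drops `comments` because it carries no signal about the label, and stage 2 keeps only one of `loc` and `loc_dup` because they fall into the same cluster. Ranking before clustering means each cluster's representative is its most relevant member, which is one simple way to combine the two phases; the paper's actual procedure may differ.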