Software defect prediction models identify program modules that are high-risk, i.e., likely to contain a large number of faults. These models are built from software metrics collected during the software development process. Various techniques have been proposed to improve fault prediction; one of them is feature (metric) selection. Choosing the most relevant features improves the effectiveness of defect predictors, but relying on a single feature subset selection method may lead to a local optimum. Ensembles of feature selection methods combine multiple feature selection methods instead of relying on a single one. In this paper, we present a comprehensive empirical study examining 17 different ensembles of feature ranking techniques (rankers). The rankers include six commonly used feature ranking techniques, the signal-to-noise filter technique, and 11 threshold-based feature ranking techniques. The study uses 16 real-world software measurement data sets of different sizes and builds 54,400 classification models with four well-known classifiers. The main conclusion is that ensembles of very few rankers are very effective, and even outperform ensembles of many or all rankers.
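The ensemble idea described above can be sketched as rank aggregation: each ranker scores every metric, and the ensemble combines the per-ranker ranks before selecting the top-k features. The combination rule (mean rank) and the ranker names and scores below are illustrative assumptions; the abstract does not specify how the rankers are combined.

```python
# Minimal sketch of an ensemble of feature rankers via mean-rank aggregation.
# The mean-rank rule and the example ranker scores are assumptions for
# illustration, not the paper's exact procedure.

def aggregate_ranks(ranker_scores, k):
    """ranker_scores: list of dicts mapping feature -> relevance score
    (higher = more relevant). Returns the top-k features by mean rank."""
    features = sorted(ranker_scores[0])
    mean_rank = {}
    for f in features:
        ranks = []
        for scores in ranker_scores:
            # rank 0 = most relevant feature under this ranker
            ordered = sorted(scores, key=scores.get, reverse=True)
            ranks.append(ordered.index(f))
        mean_rank[f] = sum(ranks) / len(ranks)
    # lower mean rank = consistently relevant across the rankers
    return sorted(features, key=mean_rank.get)[:k]

# Example: two hypothetical rankers scoring three software metrics
chi2 = {"loc": 0.9, "cyclomatic": 0.7, "fan_in": 0.1}
ig   = {"loc": 0.8, "cyclomatic": 0.9, "fan_in": 0.2}
print(aggregate_ranks([chi2, ig], k=2))  # -> ['cyclomatic', 'loc']
```

Mean rank is only one possible combination rule; median rank or highest-rank voting are common alternatives, and the choice matters most when the individual rankers disagree strongly.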