Software quality prediction models use the software metrics and fault data collected from previous software releases or similar projects to predict the quality of software components under development. Previous research has shown that such models can yield predictions with impressive accuracy. However, building an accurate software quality prediction model remains challenging for two reasons. First, outliers in software data often have a disproportionate effect on the overall predictive ability of the model. Second, not all collected software metrics should be used to construct the model, because of the curse of dimensionality. To resolve these two problems, we present a new software quality prediction model based on a genetic algorithm (GA), in which outlier detection and feature selection are executed simultaneously. The experimental results show that this model performs better than some recently proposed software quality prediction models based on S-PLUS and TreeDisc. Furthermore, the clustered software components and selected features are easier for software engineers and data analysts to study and interpret.
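To make the idea of running outlier detection and feature selection in a single GA more concrete, the following is a minimal sketch under stated assumptions, not the model described in the abstract. It assumes a chromosome that concatenates a binary feature mask with a binary "keep this module" mask, a nearest-centroid classifier as a stand-in for the quality model, and a fitness that subtracts small penalties for selected features and for modules flagged as outliers. All function names, penalty weights, and GA settings here are hypothetical choices made for illustration.

```python
import random


def nearest_centroid_accuracy(X, y, feat_mask, keep_mask):
    """Fit class centroids on kept rows and selected features; score on all rows."""
    feats = [j for j, bit in enumerate(feat_mask) if bit]
    if not feats:
        return 0.0  # no feature selected: nothing to learn from
    sums, counts = {}, {}
    for row, label, keep in zip(X, y, keep_mask):
        if not keep:
            continue  # row is flagged as an outlier and excluded from training
        acc = sums.setdefault(label, [0.0] * len(feats))
        for k, j in enumerate(feats):
            acc[k] += row[j]
        counts[label] = counts.get(label, 0) + 1
    if len(counts) < len(set(y)):
        return 0.0  # some class lost all of its training rows
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def sq_dist(row, c):
        return sum((row[j] - centroids[c][k]) ** 2 for k, j in enumerate(feats))

    correct = sum(
        min(centroids, key=lambda c: sq_dist(row, c)) == label
        for row, label in zip(X, y)
    )
    return correct / len(X)


def fitness(chrom, X, y, n_features, feat_penalty=0.05, outlier_penalty=0.2):
    """Accuracy minus assumed penalties for feature count and outlier count."""
    feat_mask, keep_mask = chrom[:n_features], chrom[n_features:]
    acc = nearest_centroid_accuracy(X, y, feat_mask, keep_mask)
    n_outliers = len(keep_mask) - sum(keep_mask)
    return (acc
            - feat_penalty * sum(feat_mask) / n_features
            - outlier_penalty * n_outliers / len(keep_mask))


def run_ga(X, y, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    """Evolve feature and outlier masks together; returns (features, keep, score)."""
    rng = random.Random(seed)
    n_features, n_samples = len(X[0]), len(X)
    length = n_features + n_samples

    def score(chrom):
        return fitness(chrom, X, y, n_features)

    def random_chrom():
        feats = [rng.randint(0, 1) for _ in range(n_features)]
        keeps = [1 if rng.random() < 0.9 else 0 for _ in range(n_samples)]
        return feats + keeps

    pop = [random_chrom() for _ in range(pop_size)]
    for _ in range(generations):
        next_pop = sorted(pop, key=score, reverse=True)[:2]  # elitism
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(pop, 3), key=score)  # tournament selection
            p2 = max(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, length)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=score)
    return best[:n_features], best[n_features:], score(best)
```

Calling `run_ga(X, y)` with `X` as a list of metric vectors and `y` as fault-proneness labels returns the selected-feature mask, the keep mask (zeros mark suspected outliers), and the best fitness found. Because both masks live in one chromosome, a candidate is rewarded only when its feature subset and its outlier set work well together, which is the motivation for evolving them jointly rather than in two separate passes.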