Accurate detection of fault-prone modules offers a path to high-quality software products while minimizing nonessential assurance expenditures. This type of quality modeling requires software modules with known fault content that were developed in a similar environment. Establishing whether a module contains a fault can be expensive. The basic idea behind semi-supervised learning is to learn from a small number of software modules with known fault content and to supplement model training with modules for which fault information is not available. In this study, we investigate the performance of semi-supervised learning for software fault prediction. A preprocessing strategy, multidimensional scaling, is embedded in the approach to reduce the dimensionality of the software metrics. Our results show that the semi-supervised learning algorithm with dimension reduction performs significantly better than random forest, one of the best-performing supervised learning algorithms, when few modules with known fault content are available for training.
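The idea of supplementing a few labeled modules with unlabeled ones can be illustrated with a minimal self-training sketch. This is not the algorithm evaluated in the study: the abstract does not specify the learner, so a nearest-centroid classifier is swapped in as an assumption, the multidimensional scaling step is omitted, and the names `self_train`, `predict`, and the `margin` threshold are hypothetical.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def fit(labeled):
    """Return (c0, c1): centroids of the not-fault-prone and fault-prone modules.

    Assumes each class has at least one labeled module.
    """
    c0 = centroid([x for x, y in labeled if y == 0])
    c1 = centroid([x for x, y in labeled if y == 1])
    return c0, c1


def predict(x, centroids):
    """Label a metrics vector by its nearest centroid (1 = fault-prone)."""
    c0, c1 = centroids
    return 0 if dist(x, c0) <= dist(x, c1) else 1


def self_train(labeled, unlabeled, margin=1.0, max_rounds=10):
    """Self-training with a nearest-centroid base classifier (illustrative only).

    labeled:   list of (metrics_vector, label) pairs with known fault content
    unlabeled: list of metrics vectors with unknown fault content
    margin:    hypothetical confidence threshold; a module is pseudo-labeled
               only when its distances to the two centroids differ by >= margin
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        c0, c1 = fit(labeled)
        confident, rest = [], []
        for x in pool:
            d0, d1 = dist(x, c0), dist(x, c1)
            if abs(d0 - d1) >= margin:
                confident.append((x, 0 if d0 < d1 else 1))
            else:
                rest.append(x)
        if not confident:
            break  # nothing left that the current model is confident about
        labeled.extend(confident)  # pseudo-labeled modules join the training set
        pool = rest
    return fit(labeled)


# Toy data: two labeled modules, five unlabeled ones (two metrics each).
labeled = [([1.0, 1.0], 0), ([9.0, 9.0], 1)]
unlabeled = [[2.0, 1.0], [1.0, 2.0], [8.0, 9.0], [9.0, 8.0], [5.0, 5.0]]
model = self_train(labeled, unlabeled)
print(predict([2.0, 2.0], model), predict([8.0, 7.0], model))
```

In this toy run the four unlabeled modules near a labeled one are pseudo-labeled and shift the centroids, while the ambiguous module at `[5.0, 5.0]` is never added. In the setting the abstract describes, the metrics vectors would first be mapped to a lower-dimensional space before a loop of this kind is applied.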