Application of K-Medoids with Kd-Tree for Software Fault Prediction
ACM SIGSOFT Software Engineering Notes
Hi-index | 0.00 |
Predicting the fault-proneness of program modules when the fault labels for modules are unavailable is a practical problem frequently encountered in the software industry. Because fault data belonging to previous software version is not available, supervised learning approaches can not be applied, leading to the need for new methods, tools, or techniques. In this study, we propose a clustering and metrics thresholds based software fault prediction approach for this challenging problem and explore it on three datasets, collected from a Turkish white-goods manufacturer developing embedded controller software. Experiments reveal that unsupervised software fault prediction can be automated and reasonable results can be produced with techniques based on metrics thresholds and clustering. The results of this study demonstrate the effectiveness of metrics thresholds and show that the standalone application of metrics thresholds (one-stage) is currently easier than the clustering and metrics thresholds based (two-stage) approach because the selection of cluster number is performed heuristically in this clustering based method.