Software fault prediction often suffers from the unavailability of fault data, which makes supervised techniques difficult to apply. In such cases, unsupervised approaches such as clustering are helpful. In this paper, the K-Medoids clustering approach is applied to software fault prediction. To overcome the inherent computational complexity of the K-Medoids algorithm, a Kd-Tree data structure is used to identify data agents in the datasets. Partitioning Around Medoids (PAM) is applied to these data agents to obtain a set of medoids, and every remaining data point is then assigned to its nearest medoid to form the final clusters. Error analysis of the fault predictions shows that, on one of the real datasets, our approach outperforms all the other unsupervised approaches and gives the best values for the evaluation parameters; on the other real datasets, our results are comparable to those of other techniques. We also compare the performance of our technique with that of other techniques. The results show that it drastically reduces the total number of distance calculations, since the number of data agents is much smaller than the number of data points.
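The pipeline described above (Kd-Tree data agents, Partitioning Around Medoids on the agents, then nearest-medoid assignment of all points) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say how a data agent is derived, so the sketch assumes each agent is the mean of one Kd-Tree leaf bucket, uses Euclidean distance, and replaces full PAM with a simplified greedy swap search. The names `kd_leaf_agents`, `pam_swap`, `cluster_with_agents`, and the `leaf_size` parameter are hypothetical.

```python
import numpy as np


def kd_leaf_agents(points, leaf_size):
    """Recursively split points kd-tree style; return one agent per leaf.

    Assumption: an agent is the mean of the points in a leaf bucket.
    Requires leaf_size >= 1.
    """
    if len(points) <= leaf_size:
        return [points.mean(axis=0)]
    # Split on the dimension with the widest spread, at the median position.
    axis = int(np.argmax(points.max(axis=0) - points.min(axis=0)))
    order = np.argsort(points[:, axis], kind="stable")
    mid = len(points) // 2
    return (kd_leaf_agents(points[order[:mid]], leaf_size)
            + kd_leaf_agents(points[order[mid:]], leaf_size))


def pam_swap(agents, k, max_iter=100):
    """Simplified PAM-style swap search over the agents only.

    Returns the indices of k medoids. Requires k <= len(agents).
    """
    # Pairwise distances are computed between agents, not between all points.
    dist = np.linalg.norm(agents[:, None, :] - agents[None, :, :], axis=2)
    medoids = list(range(k))  # naive initialisation: the first k agents
    cost = dist[:, medoids].min(axis=1).sum()
    for _ in range(max_iter):
        improved = False
        for i in range(k):
            for candidate in range(len(agents)):
                if candidate in medoids:
                    continue
                trial = medoids.copy()
                trial[i] = candidate
                trial_cost = dist[:, trial].min(axis=1).sum()
                if trial_cost < cost - 1e-12:  # accept any improving swap
                    medoids, cost, improved = trial, trial_cost, True
        if not improved:
            break
    return medoids


def cluster_with_agents(points, k, leaf_size=8):
    """Agents -> medoids -> assign every data point to its nearest medoid."""
    agents = np.array(kd_leaf_agents(points, leaf_size))
    medoids = agents[pam_swap(agents, k)]
    to_medoids = np.linalg.norm(points[:, None, :] - medoids[None, :, :], axis=2)
    return medoids, to_medoids.argmin(axis=1), len(agents)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for module metrics: two well-separated groups.
    data = np.vstack([rng.normal(0.0, 0.5, (40, 2)),
                      rng.normal(10.0, 0.5, (40, 2))])
    medoids, labels, n_agents = cluster_with_agents(data, k=2)
    print("agents:", n_agents, "points:", len(data))
    print("cluster sizes:", np.bincount(labels))
```

In this sketch the swap search builds its distance matrix over the agents alone, so its cost grows with the number of agents rather than the number of data points; the only pass over the full dataset is the final nearest-medoid assignment. Which cluster is labelled fault-prone is a separate decision that the sketch does not address.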