A Fault Prediction Model with Limited Fault Data to Improve Test Process

  • Authors:
  • Cagatay Catal; Banu Diri

  • Affiliations:
  • The Scientific and Technological Research Council of Turkey, Marmara Research Center, Information Technologies Institute, Kocaeli, Turkey; Department of Computer Engineering, Yildiz Technical University, Istanbul, Turkey

  • Venue:
  • PROFES '08 Proceedings of the 9th international conference on Product-Focused Software Process Improvement
  • Year:
  • 2008

Abstract

Software fault prediction models are used to identify fault-prone software modules and to produce reliable software. The performance of a software fault prediction model is correlated with the available software metrics and fault data. In some cases, only a few software modules have fault data, so prediction models that use only labeled data cannot provide accurate results. Semi-supervised learning approaches, which benefit from both unlabeled and labeled data, may be applied in this case. In this paper, we propose an artificial immune system based semi-supervised learning approach. The proposed approach uses a recent semi-supervised algorithm called YATSI (Yet Another Two Stage Idea), and AIRS (Artificial Immune Recognition Systems) is applied in the first stage of YATSI. In addition, AIRS, the RF (Random Forests) classifier, AIRS based YATSI, and RF based YATSI are benchmarked. Experimental results showed that while the YATSI algorithm improved the performance of AIRS, it diminished the performance of RF on unbalanced datasets. Furthermore, the performance of AIRS based YATSI is comparable with that of RF, which is the best machine learning classifier according to some studies.
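
The two-stage idea behind YATSI can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: a simple nearest-centroid learner stands in for the first-stage classifier (the paper uses AIRS or RF there), the second stage is a weighted k-nearest-neighbor vote in which pre-labeled modules are assumed to carry weight f * (N / M) for N labeled and M unlabeled modules, and the function names, the parameter f, and the two-metric feature vectors are all hypothetical.

```python
import math
from collections import defaultdict


def nearest_centroid_fit(X, y):
    """Stand-in first-stage learner (an assumption; the paper uses AIRS or RF)."""
    sums, counts = {}, defaultdict(int)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] += 1
    # One mean vector (centroid) per class label.
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}


def nearest_centroid_predict(centroids, row):
    """Return the label whose centroid is closest to the given row."""
    return min(centroids, key=lambda c: math.dist(centroids[c], row))


def yatsi_predict(X_lab, y_lab, X_unlab, query, k=3, f=1.0):
    """Two-stage prediction for one query module (illustrative sketch only)."""
    # Stage 1: train on the labeled modules, then pre-label the unlabeled ones.
    model = nearest_centroid_fit(X_lab, y_lab)
    pre_labels = [nearest_centroid_predict(model, row) for row in X_unlab]

    # Labeled modules keep weight 1.0; pre-labeled modules are assumed to be
    # scaled by f * (N / M) so that they do not dominate the vote.
    n, m = len(X_lab), len(X_unlab)
    w_unlab = f * n / m if m else 0.0
    pool = [(row, label, 1.0) for row, label in zip(X_lab, y_lab)]
    pool += [(row, label, w_unlab) for row, label in zip(X_unlab, pre_labels)]

    # Stage 2: weighted k-nearest-neighbor vote over the combined pool.
    neighbors = sorted(pool, key=lambda t: math.dist(t[0], query))[:k]
    votes = defaultdict(float)
    for _, label, weight in neighbors:
        votes[label] += weight
    return max(votes, key=votes.get)


# Hypothetical metric vectors, e.g. (complexity, size), for a few modules.
X_lab = [[1.0, 1.0], [9.0, 9.0]]
y_lab = ["not-faulty", "faulty"]
X_unlab = [[1.5, 1.2], [0.8, 1.1], [8.5, 9.2], [9.1, 8.7]]

high = yatsi_predict(X_lab, y_lab, X_unlab, [8.8, 9.0])
low = yatsi_predict(X_lab, y_lab, X_unlab, [1.0, 1.2])
```

Swapping the stand-in learner for a stronger first-stage classifier changes only the pre-labeling step; the weighted vote in the second stage stays the same.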