Software defect detection with rocus

  • Authors:
  • Yuan Jiang;Ming Li;Zhi-Hua Zhou

  • Affiliations:
  • National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China;National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China;National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China

  • Venue:
  • Journal of Computer Science and Technology
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Software defect detection aims to automatically identify defective software modules for efficient software test in order to improve the quality of a software system. Although many machine learning methods have been successfully applied to the task, most of them fail to consider two practical yet important issues in software defect detection. First, it is rather difficult to collect a large amount of labeled training data for learning a well-performing model; second, in a software system there are usually much fewer defective modules than defect-free modules, so learning would have to be conducted over an imbalanced data set. In this paper, we address these two practical issues simultaneously by prcposing a novel semi-supervised learning approach named ROCUS. This method exploits the abundant unlabeled examples to improve the detection accuracy, as well as employs under-sampling to tackle the class-imbalance problem in the learning process. Experimental results of real-world software defect detection tasks show that ROCUS is effective for software defect cetection. Its performance is better than a semi-supervised learning method that ignores the class-imbalance nature of the task and a class-imbalance learning method that does not make effective use of unlabeled data.