Sampling program quality

Authors:
Hongyu Zhang; Rongxin Wu
Affiliations:
School of Software, Tsinghua University, Beijing 100084, China; School of Software, Tsinghua University, Beijing 100084, China
Venue:
ICSM '10 Proceedings of the 2010 IEEE International Conference on Software Maintenance
Year:
2010

Cited by (3):
ReLink: recovering links between bugs and changes. Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering
Sample-based software defect prediction with active and semi-supervised learning. Automated Software Engineering
Predicting defect numbers based on defect state transition models. Proceedings of the ACM-IEEE international symposium on Empirical software engineering and measurement

Abstract

Many modern software systems are large, consisting of hundreds or even thousands of programs (source files). Understanding the overall quality of these programs is a resource- and time-consuming activity. It is desirable to have a quick yet accurate estimate of overall program quality that is also cost-effective. In this paper, we propose a sampling-based approach: for a large software project, we sample only a small percentage of the source files and then estimate the quality of all programs in the project from the characteristics of the sample. Through experiments on public defect datasets, we show that we can successfully estimate the total number of defects, the proportion of defective programs, defect distributions, and defect-proneness, all from a small sample of programs. Our experiments also show that small samples can achieve prediction accuracy similar to that of larger samples.
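The abstract does not spell out the estimators used, so the following is only a minimal sketch of the general idea, assuming simple random sampling and a plain scale-up estimator. The function name `estimate_from_sample`, its parameters, and the synthetic per-file defect counts are all hypothetical and are not taken from the paper.

```python
import random
from statistics import mean


def estimate_from_sample(defect_counts, sample_fraction, seed=0):
    """Estimate project-level defect statistics from a simple random sample.

    Assumption: defect_counts holds one defect count per source file. In
    practice only the sampled files would be inspected; the full list is
    passed here so the sketch can draw the sample itself.
    """
    if not 0 < sample_fraction <= 1:
        raise ValueError("sample_fraction must be in (0, 1]")

    n_total = len(defect_counts)
    n_sample = max(1, round(n_total * sample_fraction))

    # Draw files uniformly at random, without replacement.
    rng = random.Random(seed)
    sample = rng.sample(defect_counts, n_sample)

    # Scale the sample mean up to the whole project.
    estimated_total = mean(sample) * n_total
    # Share of sampled files that contain at least one defect.
    defective_share = sum(1 for d in sample if d > 0) / n_sample

    return {
        "sample_size": n_sample,
        "estimated_total_defects": estimated_total,
        "estimated_defective_proportion": defective_share,
    }


if __name__ == "__main__":
    # Synthetic data for illustration only: 1000 files, most defect-free.
    data_rng = random.Random(42)
    project = [data_rng.choice([0, 0, 0, 0, 1, 2, 5]) for _ in range(1000)]

    print(estimate_from_sample(project, sample_fraction=0.05))
    print("actual total defects:", sum(project))
```

Running the script with different `sample_fraction` values gives a rough feel for how the estimate varies with sample size on this synthetic data; it says nothing about the accuracy reported in the paper.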