Quality Assessment Based on Attribute Series of Software Evolution

  • Authors:
  • Jacek Ratzinger;Harald Gall;Martin Pinzger

  • Venue:
  • WCRE '07 Proceedings of the 14th Working Conference on Reverse Engineering
  • Year:
  • 2007

Abstract

Defect density and defect prediction are essential for efficient resource allocation in software evolution. In an empirical study we applied data mining techniques for value series based on evolution attributes such as number of authors, commit messages, lines of code, bug fix count, etc. Daily data points of these evolution attributes were captured over a period of two months to predict the defects in the subsequent two months in a project. For that, we developed models utilizing genetic programming and linear regression to accurately predict software defects. In our study, we investigated the data of three independent projects, two open source and one commercial software system. The results show that by utilizing series of these attributes we obtain models with high correlation coefficients (between 0.716 and 0.946). Further, we argue that prediction models based on series of a single variable are sometimes superior to the model including all attributes: in contrast to other studies that resulted in size or complexity measures as predictors, we have identified the number of authors and the number of commit messages to versioning systems as excellent predictors of defect densities.
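
The single-variable idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes, purely for illustration, that each file is one observation, that a daily author-count series is summarized by its mean, and that a simple least-squares line stands in for the paper's linear regression. All file names and numbers are made up, and the series are shortened to five days for readability.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical observation window: daily number of authors per file.
author_series = {
    "a.c": [1, 1, 2, 1, 1],
    "b.c": [2, 3, 3, 2, 4],
    "c.c": [1, 0, 0, 1, 0],
    "d.c": [4, 5, 3, 4, 5],
}
# Hypothetical defect counts observed in the subsequent window.
later_defects = {"a.c": 2, "b.c": 5, "c.c": 1, "d.c": 7}

files = sorted(author_series)
x = [mean(author_series[f]) for f in files]  # one summary value per series
y = [later_defects[f] for f in files]

a, b = fit_line(x, y)
predicted = [a + b * v for v in x]
r = pearson(predicted, y)  # in-sample correlation of predicted vs. actual

print(f"slope={b:.3f} intercept={a:.3f} correlation={r:.3f}")
```

The correlation here is computed on the same data used for fitting, so it only shows how such a coefficient is obtained; a real evaluation would use held-out data.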