Interactive churn metrics: socio-technical variants of code churn

  • Authors:
  • Andrew Meneely;Oluyinka Williams

  • Affiliations:
  • Department of Software Engineering Rochester Institute of Technology, Rochester, NY;Department of Software Engineering Rochester Institute of Technology, Rochester, NY

  • Venue:
  • ACM SIGSOFT Software Engineering Notes
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

A central part of software quality is finding bugs. One method of finding bugs is by measuring important aspects of the software product and the development process. In recent history, researchers have discovered evidence of a "code churn" effect whereby the degree to which a given source code file has changed over time is correlated with faults and vulnerabilities. Computing the code churn metric comes from counting source code differences in version control repositories. However, code churn does not take into account a critical factor of any software development team: the human factor, specifically who is making the changes. In this paper, we introduce a new class of human-centered metrics, "interactive churn metrics" as variants of code churn. Using the git blame tool, we identify the most recent developer who changed a given line of code in a file prior to a given revision. Then, for each line changed in a given revision, determined if the revision author was changing his or her own code ("self churn"), or the author was changing code last modified by somebody else ("interactive churn"). We derive and present several metrics from this concept. Finally, we conducted an empirical analysis of these metrics on the PHP programming language and its post-release vulnerabilities. We found that our interactive churn metrics are statistically correlated with post-release vulnerabilities and only weakly correlated with code churn metrics and source lines of code. The results indicate that interactive churn metrics are associated with software quality and are different from the code churn and source lines of code.