Mining sequences of changed-files from version histories

  • Authors:
  • Huzefa Kagdi; Shehnaaz Yusuf; Jonathan I. Maletic

  • Affiliations:
  • Kent State University, Kent, Ohio; Kent State University, Kent, Ohio; Kent State University, Kent, Ohio

  • Venue:
  • Proceedings of the 2006 International Workshop on Mining Software Repositories
  • Year:
  • 2006

Abstract

Modern source-control systems, such as Subversion, preserve change-sets of files as atomic commits. However, the order in which the files were changed is typically not recorded in these source-code repositories. This paper presents a set of heuristics for grouping change-sets (i.e., log entries) found in source-code repositories. Given such groups of change-sets, sequences of files that frequently change together are uncovered. The approach yields not only the (unordered) sets of files but also supplements them with (partial temporal) ordering information. The technique is demonstrated on a subset of the KDE source-code repository. The results show that the approach is able to find sequences of changed-files.
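
The abstract does not spell out the grouping heuristics, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes one plausible heuristic (consecutive commits by the same author within a fixed time window form a group) and counts ordered file pairs across groups. The log data, the names `group_change_sets`, `ordered_pairs`, and `frequent_sequences`, and the `window` and `min_support` parameters are all hypothetical.

```python
from collections import Counter
from itertools import groupby

# Hypothetical log entries: (author, timestamp in seconds, files in the commit).
LOG = [
    ("alice", 0,    {"a.h"}),
    ("alice", 600,  {"a.cpp"}),
    ("bob",   700,  {"x.py"}),
    ("alice", 9000, {"a.h"}),
    ("alice", 9300, {"a.cpp", "a_test.cpp"}),
]

def group_change_sets(log, window=3600):
    """Assumed heuristic: group an author's commits while the gap between
    consecutive commits is at most `window` seconds."""
    groups = []
    by_author = sorted(log, key=lambda e: (e[0], e[1]))
    for _, entries in groupby(by_author, key=lambda e: e[0]):
        current, last_time = [], None
        for _, ts, files in entries:
            if last_time is not None and ts - last_time > window:
                groups.append(current)
                current = []
            current.append(frozenset(files))
            last_time = ts
        if current:
            groups.append(current)
    return groups

def ordered_pairs(groups):
    """Count (earlier_file, later_file) pairs, once per group.
    Files inside one commit stay unordered, so the ordering is only partial."""
    counts = Counter()
    for group in groups:
        seen = set()
        for i, earlier in enumerate(group):
            for later in group[i + 1:]:
                for f in earlier:
                    for g in later:
                        if f != g:
                            seen.add((f, g))
        counts.update(seen)
    return counts

def frequent_sequences(log, window=3600, min_support=2):
    """Keep ordered pairs that occur in at least `min_support` groups."""
    counts = ordered_pairs(group_change_sets(log, window))
    return {pair: n for pair, n in counts.items() if n >= min_support}

print(frequent_sequences(LOG))
```

On this toy log, the only pair reaching the support threshold is `a.h` changed before `a.cpp`. Longer sequences would need a proper sequential-pattern miner; pairs are used here only to keep the sketch short.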