Codebook: discovering and exploiting relationships in software repositories

  • Authors:
  • Andrew Begel;Yit Phang Khoo;Thomas Zimmermann

  • Affiliations:
  • Microsoft Research, Redmond, WA;University of Maryland, College Park, MD;Microsoft Research, Redmond, WA

  • Venue:
  • Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Large-scale software engineering requires communication and collaboration to successfully build and ship products. We conducted a survey with Microsoft engineers on inter-team coordination and found that the most impactful problems concerned finding and keeping track of other engineers. Since engineers are connected by their shared work, a tool that discovers connections in their work-related repositories can help. Here we describe the Codebook framework for mining software repositories. It is flexible enough to address all of the problems identified by our survey with a single data structure (graph of people and artifacts) and a single algorithm (regular language reachability). Codebook handles a larger variety of problems than prior work, analyzes more kinds of work artifacts, and can be customized by and for end-users. To evaluate our framework's flexibility, we built two applications, Hoozizat and Deep Intellisense. We evaluated these applications with engineers to show effectiveness in addressing multiple inter-team coordination problems.