Mining software repositories using topic models

  • Authors:
  • Stephen W. Thomas

  • Affiliations:
  • Queen's University, Kingston, ON, Canada

  • Venue:
  • Proceedings of the 33rd International Conference on Software Engineering
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Software repositories, such as source code, email archives, and bug databases, contain unstructured and unlabeled text that is difficult to analyze with traditional techniques. We propose the use of statistical topic models to automatically discover structure in these textual repositories. This discovered structure has the potential to be used in software engineering tasks, such as bug prediction and traceability link recovery. Our research goal is to address the challenges of applying topic models to software repositories.