Source Code Retrieval for Bug Localization Using Latent Dirichlet Allocation

  • Authors:
  • Stacy K. Lukins;Nicholas A. Kraft;Letha H. Etzkorn

  • Affiliations:
  • -;-;-

  • Venue:
  • WCRE '08 Proceedings of the 2008 15th Working Conference on Reverse Engineering
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

In bug localization, a developer uses information about a bug to locate the portion of the source code to modify to correct the bug. Developers expend considerable effort performing this task. Some recent static techniques for automatic bug localization have been built around modern information retrieval (IR) models such as latent semantic indexing (LSI); however, latent Dirichlet allocation (LDA), a modular and extensible IR model, has significant advantages over both LSI and probabilistic LSI (pLSI). In this paper we present an LDA-based static technique for automating bug localization. We describe the implementation of our technique and three case studies that measure its effectiveness. For two of the case studies we directly compare our results to those from similar studies performed using LSI. The results demonstrate our LDA-based technique performs at least as well as the LSI-based techniques for all bugs and performs better, often significantly so, than the LSI-based techniques for most bugs.