Benchmarking Lightweight Techniques to Link E-Mails and Source Code

  • Authors:
  • Alberto Bacchelli;Marco D'Ambros;Michele Lanza;Romain Robbes

  • Venue:
  • WCRE '09 Proceedings of the 2009 16th Working Conference on Reverse Engineering
  • Year:
  • 2009

Abstract

During the evolution of a software system, a large amount of information is produced that is not always directly related to the source code. Several researchers have provided evidence that the contents of mailing lists represent a valuable source of information: through e-mails, developers discuss design decisions, ideas, known problems, and bugs, which are otherwise not to be found in the system.

A technical challenge in this context is how to establish the missing link between free-form e-mails and the system artifacts they refer to. Although the range of approaches is vast, establishing their accuracy remains a problem, as there is no benchmark against which to compare their performance.

To overcome this issue, we manually inspected a statistically significant number of e-mails pertaining to the ArgoUML system. Based on this benchmark, we present a variety of lightweight techniques to assign e-mails to software artifacts and measure their effectiveness in terms of precision and recall.
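As a rough illustration of how a lightweight linking technique could be scored against a manually built benchmark, the sketch below matches class names as whole words in e-mail text and computes precision and recall over the resulting links. It is not the authors' implementation: the matching rule, the function names `link_by_name` and `precision_recall`, the class names, and the three sample e-mails are all assumptions made up for this example.

```python
import re

def link_by_name(email_text, class_names):
    """Return the class names that occur as whole, case-sensitive words."""
    return {
        name for name in class_names
        if re.search(r"\b" + re.escape(name) + r"\b", email_text)
    }

def precision_recall(predicted, expected):
    """Compare predicted e-mail-to-class links with the benchmark links.

    Both arguments map an e-mail id to a set of class names.
    """
    true_positives = sum(
        len(links & expected.get(mail_id, set()))
        for mail_id, links in predicted.items()
    )
    n_predicted = sum(len(links) for links in predicted.values())
    n_expected = sum(len(links) for links in expected.values())
    precision = true_positives / n_predicted if n_predicted else 0.0
    recall = true_positives / n_expected if n_expected else 0.0
    return precision, recall

# Hypothetical class names and e-mails, invented for illustration only.
class_names = {"Project", "FigClass", "Critic"}
emails = {
    "m1": "The FigClass rendering breaks after the last commit.",
    "m2": "Our Project needs a new release plan.",
    "m3": "I think the critics subsystem should be refactored.",
}
# Hypothetical benchmark: the links a human annotator would assign.
benchmark = {"m1": {"FigClass"}, "m2": set(), "m3": {"Critic"}}

predicted = {mid: link_by_name(text, class_names) for mid, text in emails.items()}
precision, recall = precision_recall(predicted, benchmark)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy data the common word "Project" produces a false link in `m2`, and the lowercase plural "critics" in `m3` is missed, so both precision and recall come out at 0.5. These are the kinds of errors a benchmark makes visible when comparing matching rules.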