Applying a smoothing filter to improve IR-based traceability recovery processes: An empirical investigation

  • Authors:
  • Andrea De Lucia; Massimiliano Di Penta; Rocco Oliveto; Annibale Panichella; Sebastiano Panichella

  • Affiliations:
  • University of Salerno, Via Ponte don Melillo, 84084 Fisciano (SA), Italy; University of Sannio, Viale Traiano, 82100 Benevento, Italy; University of Molise, C.da Fonte Lappone, 86090 Pesche (IS), Italy; University of Salerno, Via Ponte don Melillo, 84084 Fisciano (SA), Italy; University of Sannio, Viale Traiano, 82100 Benevento, Italy

  • Venue:
  • Information and Software Technology
  • Year:
  • 2013

Abstract

Context: Traceability relations among software artifacts are often missing, outdated, or lost. For this reason, various traceability recovery approaches based on Information Retrieval (IR) techniques have been proposed. The performance of such approaches is often influenced by "noise" contained in software artifacts (e.g., recurring words in document templates or other words that do not contribute to the retrieval itself). Aim: As a complement and alternative to stop word removal approaches, this paper proposes the use of a smoothing filter to remove "noise" from the textual corpus of artifacts to be traced. Method: We evaluate the effect of a smoothing filter in traceability recovery tasks involving different kinds of artifacts from five software projects, applying three different IR methods, namely Vector Space Models, Latent Semantic Indexing, and the Jensen-Shannon similarity model. Results: Our study indicates that, with the exception of some specific kinds of artifacts (i.e., tracing test cases to source code), the proposed approach is able to significantly improve the performance of traceability recovery and to remove "noise" that simple stop word filters cannot remove. Conclusions: The obtained results not only help to develop traceability recovery approaches able to work in the presence of noisy artifacts, but also suggest that smoothing filters can be used to improve the performance of other software engineering approaches based on textual analysis.
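
The abstract does not specify the form of the smoothing filter. As a rough illustration only, the sketch below assumes a mean filter: each artifact is represented as a term-frequency vector, and the per-term average across all artifacts is subtracted from every vector, so that words shared by every artifact (such as template headings) end up with zero weight. The function names, the sample artifacts, and the choice of raw term frequencies with cosine similarity are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def term_vectors(docs):
    """Build raw term-frequency vectors over a shared, sorted vocabulary."""
    vocab = sorted({tok for doc in docs for tok in doc.lower().split()})
    vectors = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        vectors.append([float(counts[term]) for term in vocab])
    return vocab, vectors


def mean_filter(vectors):
    """Subtract the per-term mean across artifacts (assumed filter form)."""
    n = len(vectors)
    mean = [sum(column) / n for column in zip(*vectors)]
    return [[w - m for w, m in zip(vec, mean)] for vec in vectors]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical artifacts that all share the template words "use case description".
docs = [
    "use case description login user password",
    "use case description search catalog item",
    "use case description login session token",
]

vocab, raw = term_vectors(docs)
filtered = mean_filter(raw)

# Similarity between the first two artifacts, before and after filtering.
print(cosine(raw[0], raw[1]), cosine(filtered[0], filtered[1]))
```

In this toy corpus the first two artifacts share only the template words, so their raw similarity is driven entirely by "noise"; after filtering, those terms no longer contribute. Because mean-centered weights can be negative, the filtered cosine scores can drop below zero, and a real pipeline would need to decide how to handle that (for example by filtering source and target artifact sets separately).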