Concept location using formal concept analysis and information retrieval

  • Authors:
  • Denys Poshyvanyk;Malcom Gethers;Andrian Marcus

  • Affiliations:
  • College of William and Mary;College of William and Mary;Wayne State University

  • Venue:
  • ACM Transactions on Software Engineering and Methodology (TOSEM)
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

The article addresses the problem of concept location in source code by proposing an approach that combines Formal Concept Analysis and Information Retrieval. In the proposed approach, Latent Semantic Indexing, an advanced Information Retrieval approach, is used to map textual descriptions of software features or bug reports to relevant parts of the source code, presented as a ranked list of source code elements. Given the ranked list, the approach selects the most relevant attributes from the best ranked documents, clusters the results, and presents them as a concept lattice, generated using Formal Concept Analysis. The approach is evaluated through a large case study on concept location in the source code on six open-source systems, using several hundred features and bugs. The empirical study focuses on the analysis of various configurations of the generated concept lattices and the results indicate that our approach is effective in organizing different concepts and their relationships present in the subset of the search results. In consequence, the proposed concept location method has been shown to outperform a standalone Information Retrieval based concept location technique by reducing the number of irrelevant search results across all the systems and lattice configurations evaluated, potentially reducing the programmers' effort during software maintenance tasks involving concept location.