Leveraging usage similarity for effective retrieval of examples in code repositories

  • Authors:
  • Sushil K. Bajracharya;Joel Ossher;Cristina V. Lopes

  • Affiliations:
  • University of California, Irvine, Irvine, CA, USA;University of California, Irvine, Irvine, CA, USA;University of California, Irvine, Irvine, CA, USA

  • Venue:
  • Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Developers often learn to use APIs (Application Programming Interfaces) by looking at existing examples of API usage. Code repositories contain many instances of such usage of APIs. However, conventional information retrieval techniques fail to perform well in retrieving API usage examples from code repositories. This paper presents Structural Semantic Indexing (SSI), a technique to associate words to source code entities based on similarities of API usage. The heuristic behind this technique is that entities (classes, methods, etc.) that show similar uses of APIs are semantically related because they do similar things. We evaluate the effectiveness of SSI in code retrieval by comparing three SSI based retrieval schemes with two conventional baseline schemes. We evaluate the performance of the retrieval schemes by running a set of 20 candidate queries against a repository containing 222,397 source code entities from 346 jars belonging to the Eclipse framework. The results of the evaluation show that SSI is effective in improving the retrieval of examples in code repositories.