How Well Do Search Engines Support Code Retrieval on the Web?

  • Authors:
  • Susan Elliott Sim; Medha Umarji; Sukanya Ratanotayanon; Cristina V. Lopes

  • Affiliations:
  • University of California, Irvine; University of Maryland, Baltimore County; University of California, Irvine; University of California, Irvine

  • Venue:
  • ACM Transactions on Software Engineering and Methodology (TOSEM)
  • Year:
  • 2011

Abstract

Software developers search the Web for various kinds of source code for diverse reasons. In a previous study, we found that searches varied along two dimensions: the size of the search target (e.g., block, subsystem, or system) and the motivation for the search (e.g., reference example or as-is reuse). Would each of these kinds of searches require different search technologies? To answer this question, we conducted an experiment with 36 participants to evaluate three diverse approaches (general-purpose information retrieval, source code search, and component reuse), as represented by five Web sites (Google, Koders, Krugle, Google Code Search, and SourceForge). The independent variables were search engine, size of search target, and motivation for search. The dependent variable was the participants' judgement of the relevance of the first ten hits. We found that it was easier to find reference examples than components for as-is reuse and that participants obtained the best results using a general-purpose information retrieval site. However, we also found an interaction effect: code-specific search engines worked better in searches for subsystems, but Google worked better on searches for blocks. These results can be used to guide the creation of new tools for retrieving source code from the Web.