How Well Do Search Engines Support Code Retrieval on the Web?

Authors:
Susan Elliott Sim;Medha Umarji;Sukanya Ratanotayanon;Cristina V. Lopes
Affiliations:
University of California, Irvine;University of Maryland, Baltimore County;University of California, Irvine;University of California, Irvine
Venue:
ACM Transactions on Software Engineering and Methodology (TOSEM)
Year:
2011

Abstract

Software developers search the Web for various kinds of source code for diverse reasons. In a previous study, we found that searches varied along two dimensions: the size of the search target (e.g., block, subsystem, or system) and the motivation for the search (e.g., reference example or as-is reuse). Would each of these kinds of searches require different search technologies? To answer this question, we conducted an experiment with 36 participants to evaluate three diverse approaches (general-purpose information retrieval, source code search, and component reuse), as represented by five Web sites (Google, Koders, Krugle, Google Code Search, and SourceForge). The independent variables were search engine, size of search target, and motivation for search. The dependent variable was the participants' judgment of the relevance of the first ten hits. We found that it was easier to find reference examples than components for as-is reuse and that participants obtained the best results using a general-purpose information retrieval site. However, we also found an interaction effect: code-specific search engines worked better in searches for subsystems, but Google worked better on searches for blocks. These results can be used to guide the creation of new tools for retrieving source code from the Web.
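
The dependent variable described above is a relevance judgment over the first ten hits of each search. One common way to summarize such judgments is precision at ten (P@10), the fraction of the first ten hits judged relevant. The sketch below is illustrative only: the abstract does not say how the judgments were aggregated, and the function name, the boolean encoding of judgments, and the sample data are all assumptions, not taken from the paper.

```python
from typing import Sequence


def precision_at_k(judgments: Sequence[bool], k: int = 10) -> float:
    """Return the fraction of the first k hits judged relevant.

    Assumption: each judgment is a simple relevant / not relevant flag.
    If a search returned fewer than k hits, the missing hits count as
    not relevant, so the denominator is always k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top = list(judgments)[:k]
    return sum(1 for judged_relevant in top if judged_relevant) / k


# Hypothetical judgments by one participant for one query on one site.
judgments = [True, True, False, True, False,
             False, True, False, False, False]

score = precision_at_k(judgments)
print(f"P@10 = {score:.2f}")
```

Scores computed this way could then be compared across the three independent variables (search engine, size of search target, and motivation for search), for example by averaging per condition.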