Recommending random walks

  • Authors:
  • Zachary M. Saul;Vladimir Filkov;Premkumar Devanbu;Christian Bird

  • Affiliations:
  • University of California: Davis, Davis, CA;University of California: Davis, Davis, CA;University of California: Davis, Davis, CA;University of California: Davis, Davis, CA

  • Venue:
  • Proceedings of the the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

We improve on previous recommender systems by taking advantage of the layered structure of software. We use a random-walk approach, mimicking the more focused behavior of a developer, who browses the caller-callee links in the callgraph of a large program, seeking routines that are likely to be related to a function of interest. Inspired by Kleinberg's work [10], we approximate the steady-state of an infinite random walk on a subset of a callgraph in order to rank the functions by their steady-state probabilities. Surprisingly, this purely structural approach works quite well. Our approach, like that of Robillard's "Suade" algorithm [15], and earlier data mining approaches [13] relies solely on the always available current state of the code, rather than other sources such as comments, documentation or revision information. Using the Apache API documentation as an oracle, we perform a quantitative evaluation of our method, finding that our algorithm dramatically improves upon Suade in this setting. We also find that the performance of traditional data mining approaches is complementary to ours; this leads naturally to an evidence-based combination of the two, which shows excellent performance on this task.