Analyzing ultra-large-scale code corpus with Boa

Authors:
Robert Dyer;Hoan Nguyen;Hridesh Rajan;Tien Nguyen
Affiliations:
Iowa State University, Ames, IA, USA;Iowa State University, Ames, IA, USA;Iowa State University, Ames, IA, USA;Iowa State University, Ames, IA, USA
Venue:
Proceedings of the 3rd annual conference on Systems, programming, and applications: software for humanity
Year:
2012

Abstract

Analyzing the wealth of information contained in software repositories requires significant expertise in mining techniques as well as a large infrastructure. To make this information more accessible to non-experts, we present the Boa language and infrastructure. With Boa, repository mining tasks are much simpler to write because the low-level details are abstracted away. Boa programs also run on a distributed cluster, which parallelizes them automatically and returns results in minutes instead of potentially days.