Interpreting the data: Parallel analysis with Sawzall
Scientific Programming - Dynamic Grids and Worldwide Computing
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation - Volume 6
Analyzing the wealth of information contained in software repositories requires significant expertise in mining techniques as well as a large infrastructure. To make this information more accessible to non-experts, we present the Boa language and infrastructure. With Boa, mining tasks are much simpler to write because the details are abstracted away. Boa programs also run on a distributed cluster, which automatically provides massive parallelization and returns results in minutes instead of potentially days.