In today's software-centric world, ultra-large-scale software repositories, e.g., SourceForge (350,000+ projects), GitHub (250,000+ projects), and Google Code (250,000+ projects), are the new Library of Alexandria. They contain an enormous corpus of software and information about software. Scientists and engineers alike are interested in analyzing this wealth of information, both out of curiosity and to test important hypotheses. However, systematically extracting relevant data from these repositories and analyzing it to test hypotheses is hard, and is best left to mining software repositories (MSR) experts. The goal of Boa, a domain-specific language and infrastructure described here, is to ease the testing of MSR-related hypotheses. We have implemented Boa and provide a web-based interface to its infrastructure. Our evaluation demonstrates that Boa substantially reduces programming effort, thus lowering the barrier to entry. We also see drastic improvements in scalability. Last but not least, reproducing an experiment conducted using Boa is just a matter of re-running the small Boa programs provided by previous researchers.