The need for automated software engineering tools and techniques continues to grow as the size and complexity of studied systems and analysis techniques increase. Software engineering researchers often scale their analysis techniques using specialized one-off solutions, expensive infrastructures, or heuristic techniques (e.g., search-based approaches). However, such efforts are not reusable and are often costly to maintain. The need for scalable analysis is especially prominent in the Mining Software Repositories (MSR) field, which specializes in the automated recovery and analysis of large amounts of data stored in software repositories. In this paper, we explore scaling automated software engineering analysis techniques by reusing scalable analysis platforms from the web field. We use three representative case studies from the MSR field to analyze the potential of the MapReduce platform to scale MSR tools with minimal effort. We document our experience so that other researchers can benefit from it. We find that many of the web field's guidelines for using the MapReduce platform need to be modified to better fit the characteristics of software engineering problems.