Generalizing MapReduce as a unified cloud and HPC runtime

Authors:
Judy Qiu
Affiliations:
Indiana University, Bloomington, IN, USA
Venue:
Proceedings of the 2nd international workshop on Petascale data analytics: challenges and opportunities
Year:
2011

Citing 3
Cited 0

Adaptable, metadata rich IO methods for portable high performance IO
IPDPS '09 Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing

Twister: a runtime for iterative MapReduce
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing

Portable Parallel Programming on Cloud and HPC: Scientific Applications of Twister4Azure
UCC '11 Proceedings of the 2011 Fourth IEEE International Conference on Utility and Cloud Computing

Abstract

Computational simulation and analysis are among the keys to the future of data-intensive science, the "fourth paradigm" of scientific discovery, but they face a major challenge in handling the rapid growth in dataset sizes. This requires attractive, powerful programming models that address portability together with scalable performance and fault tolerance. Further, one must meet these challenges for both computation and storage. We build on our research on Iterative MapReduce, which produced the successful prototypes Twister (on HPC) and Twister4Azure (on clouds). We have designed a novel Map Collective runtime that generalizes previous work in both the HPC and MapReduce communities, and we hypothesize that it can serve as the runtime for data analysis (mining) interoperably between clouds and clusters.