Multi-GPU MapReduce on GPU Clusters

Authors:
Jeff A. Stuart; John D. Owens
Venue:
IPDPS '11 Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium
Year:
2011

Cited by (13)

Simplified parallel domain traversal
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis

Introducing scalable quantum approaches in language representation
QI'11 Proceedings of the 5th international conference on Quantum interaction

Scalable framework for mapping streaming applications onto multi-GPU systems
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming

FLAT: a GPU programming framework to provide embedded MPI
Proceedings of the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units

Optimizing MapReduce for GPUs with effective shared memory usage
Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing

A Map-Reduce Based Framework for Heterogeneous Processing Element Cluster Environments
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)

Accelerating MapReduce on a coupled CPU-GPU architecture
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis

Multi-core and many-core shared-memory parallel raycasting volume rendering optimization and tuning
International Journal of High Performance Computing Applications

Accelerating text mining workloads in a MapReduce-based distributed GPU environment
Journal of Parallel and Distributed Computing

Grex: An efficient MapReduce framework for graphics processing units
Journal of Parallel and Distributed Computing

Comparison based sorting for systems with multiple GPUs
Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units

Scaling large-data computations on multi-GPU accelerators
Proceedings of the 27th international ACM conference on International conference on supercomputing

Accelerate MapReduce on GPUs with multi-level reduction
Proceedings of the 5th Asia-Pacific Symposium on Internetware

Abstract

We present GPMR, our stand-alone MapReduce library that leverages the power of GPU clusters for large-scale computing. To better utilize the GPU, we modify MapReduce by combining large numbers of map and reduce items into chunks and by using partial reductions and accumulation. We use persistent map and reduce tasks and stress aspects of GPMR with a set of standard MapReduce benchmarks. We run these benchmarks on a GPU cluster and achieve desirable speedup and efficiency for all benchmarks. To show the power of GPMR, we compare our implementation to the current-best GPU-MapReduce library (which runs on only a single GPU) and to a highly optimized multi-core MapReduce. We demonstrate how typical MapReduce tasks are easily modified to fit into GPMR and leverage a GPU cluster. We highlight how the total and relative amounts of communication affect GPMR. We conclude with an exposition on the types of MapReduce tasks that are well suited to GPMR, and on why some tasks need more modifications than others to work well with GPMR.
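
The chunking, partial-reduction, and accumulation ideas named in the abstract can be illustrated with a small word-count sketch. This is a CPU-only Python illustration of the general pattern, not GPMR's API or implementation (the abstract does not describe either); the names `chunks`, `map_chunk`, and `word_count` and the `chunk_size` parameter are hypothetical.

```python
from collections import Counter
from itertools import islice


def chunks(items, chunk_size):
    """Yield successive lists of at most chunk_size items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def map_chunk(chunk):
    """Map a whole chunk at once and partially reduce its output.

    Instead of emitting one (word, 1) pair per occurrence, the mapper
    emits one (word, count) pair per distinct word in the chunk.
    """
    partial = Counter()
    for line in chunk:
        partial.update(line.split())
    return partial


def word_count(lines, chunk_size=2):
    """Accumulate the partial results of every chunk into one total."""
    total = Counter()
    for chunk in chunks(lines, chunk_size):
        total.update(map_chunk(chunk))  # adds counts; does not overwrite
    return dict(total)


if __name__ == "__main__":
    print(word_count(["a b a", "b c", "a"], chunk_size=2))
```

Processing items in chunks amortizes per-item overhead, and reducing within a chunk before accumulating shrinks the number of intermediate pairs that would otherwise have to be moved between stages. On a GPU cluster, that movement would be transfers across the host-device bus and the network.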