Mars: Accelerating MapReduce with Graphics Processors

Authors:
Wenbin Fang;Bingsheng He;Qiong Luo;Naga K. Govindaraju
Affiliations:
University of Wisconsin-Madison, Madison;Nanyang Technological University, Singapore;Hong Kong University of Science and Technology, Hong Kong;Microsoft Corp., Redmond
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
2011

Citing 0
Cited 8

Many-Core architecture oriented parallel algorithm design for computer animation
MIG'11 Proceedings of the 4th international conference on Motion in Games

A Map-Reduce Based Framework for Heterogeneous Processing Element Cluster Environments
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)

Maestro: Replica-Aware Map Scheduling for MapReduce
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)

Cogset: a high performance MapReduce engine
Concurrency and Computation: Practice & Experience

CAP: co-scheduling based on asymptotic profiling in CPU+GPU hybrid systems
Proceedings of the 2013 International Workshop on Programming Models and Applications for Multicores and Manycores

MRSG - A MapReduce simulator over SimGrid
Parallel Computing

HAT: history-based auto-tuning MapReduce in heterogeneous environments
The Journal of Supercomputing

A MapReduce task scheduling algorithm for deadline constraints
Cluster Computing

Quantified Score

Hi-index	0.01

Abstract

We design and implement Mars, a MapReduce runtime system accelerated with graphics processing units (GPUs). MapReduce is a simple and flexible parallel programming paradigm, originally proposed by Google, that eases large-scale data processing on thousands of CPUs. Compared with CPUs, GPUs have an order of magnitude higher computation power and memory bandwidth. However, GPUs are designed as special-purpose coprocessors, and their programming interfaces are less familiar to MapReduce programmers than those of CPUs. To harness the power of GPUs for MapReduce, we developed Mars to run on NVIDIA GPUs, AMD GPUs, and multicore CPUs. Furthermore, we integrated Mars into Hadoop, an open-source CPU-based MapReduce system. Mars hides the programming complexity of GPUs behind the simple and familiar MapReduce interface, and automatically manages task partitioning, data distribution, and parallelization on the processors. We have implemented six representative applications on Mars and evaluated their performance on PCs equipped with GPUs as well as multicore CPUs. The experimental results show that GPU-CPU coprocessing with Mars on an NVIDIA GTX280 GPU and an Intel quad-core CPU outperformed Phoenix, the state-of-the-art MapReduce system on multicore CPUs, with a speedup of up to 72 times, and 24 times on average, depending on the application. Additionally, integrating Mars into Hadoop enabled GPU acceleration for a network of PCs.
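
To illustrate the MapReduce interface that the abstract refers to, the following is a minimal single-process sketch of the paradigm in Python, using word count as the example. It is not the Mars API: the names `map_reduce`, `word_count_map`, and `word_count_reduce` are hypothetical, and the task partitioning, data distribution, and GPU parallelization that Mars is described as managing automatically are not modeled here.

```python
from collections import defaultdict


def map_reduce(records, map_fn, reduce_fn):
    """Run a sequential MapReduce job (illustrative only, not the Mars API)."""
    # Map phase: each input record yields zero or more (key, value) pairs,
    # which are grouped by key as they are produced.
    intermediate = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            intermediate[key].append(value)
    # Reduce phase: all values that share a key are combined into one result.
    return {key: reduce_fn(key, values) for key, values in intermediate.items()}


def word_count_map(line):
    # Emit (word, 1) for every whitespace-separated word in the line.
    for word in line.split():
        yield word.lower(), 1


def word_count_reduce(word, counts):
    # Sum the per-occurrence counts emitted for this word.
    return sum(counts)


lines = ["mars runs on gpus", "mars runs on cpus"]
counts = map_reduce(lines, word_count_map, word_count_reduce)
print(counts)
```

The programmer supplies only the two user-defined functions. In a runtime of the kind the abstract describes, the loop over records and the grouping step are what the system would parallelize on the programmer's behalf.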