An Implementation of GPU Accelerated MapReduce: Using Hadoop with OpenCL for Data- and Compute-Intensive Jobs

  • Authors:
  • Miao Xin; Hao Li

  • Venue:
  • IJCSS '12 Proceedings of the 2012 International Joint Conference on Service Sciences
  • Year:
  • 2012

Abstract

MapReduce is an efficient distributed computing model for large-scale data processing. However, single-node performance is gradually becoming the bottleneck in compute-intensive jobs. This paper presents an approach to improving MapReduce with GPU acceleration, implemented with Hadoop and OpenCL. Unlike other implementations, it targets general-purpose, inexpensive hardware platforms and integrates seamlessly with Apache Hadoop, one of the most widely used MapReduce frameworks. As a heterogeneous multi-machine, multicore architecture, it is aimed at both data- and compute-intensive applications. A performance improvement of almost 2 times has been validated without any further optimization.