Map-reduce as a Programming Model for Custom Computing Machines

Authors:
Jackson H. C. Yeung; C. C. Tsang; K. H. Tsoi; Bill S. H. Kwan; Chris C. C. Cheung; Anthony P. C. Chan; Philip H. W. Leong
Venue:
FCCM '08 Proceedings of the 2008 16th International Symposium on Field-Programmable Custom Computing Machines
Year:
2008

Cited by 10

Implementing Parallel Google Map-Reduce in Eden
Euro-Par '09 Proceedings of the 15th International Euro-Par Conference on Parallel Processing

FPMR: MapReduce framework on FPGA
Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays

Axel: a heterogeneous cluster with FPGAs and GPUs
Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays

High-Performance Quasi-Monte Carlo Financial Simulation: FPGA vs. GPP vs. GPU
ACM Transactions on Reconfigurable Technology and Systems (TRETS)

Combining optimizations in automated low power design
Proceedings of the Conference on Design, Automation and Test in Europe

The RLOC is dead - long live the RLOC
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays

Dynamic workload balancing deques for branch and bound algorithms in the message passing interface
International Journal of High Performance Systems Architecture

Automated Mapping of the MapReduce Pattern onto Parallel Computing Platforms
Journal of Signal Processing Systems

Mapping a data-flow programming model onto heterogeneous platforms
Proceedings of the 13th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, Tools and Theory for Embedded Systems

A dynamic self-scheduling scheme for heterogeneous multiprocessor architectures
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers

Abstract

The map-reduce model requires users to express their problem in terms of a map function that processes single records in a stream and a reduce function that merges all mapped outputs to produce a final result. Exposing structural similarity in this way makes it easier to address a number of key issues in the design of custom computing machines: parallelisation, design complexity, software-hardware partitioning, hardware dependency, portability and scalability. We present an implementation of a map-reduce library supporting parallel field programmable gate arrays (FPGAs) and graphics processing units (GPUs). Parallelisation through pipelining, multiple data paths and concurrent execution of FPGA/GPU hardware is achieved automatically. Users first specify the map and reduce steps for the problem in ANSI C, and no knowledge of the underlying hardware or parallelisation is needed. The source code is then manually translated into a pipelined data path which, along with the map-reduce library, is compiled into appropriate binary configurations for the processing units. We describe our experience in developing a number of benchmark problems in signal processing, Monte Carlo simulation and scientific computing, and report on the performance of FPGA, GPU and heterogeneous systems.
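
As a rough illustration of the model the abstract describes, the following ANSI C sketch shows a map function applied to single records and a reduce function that merges the mapped outputs. It is not the authors' library: the names `map_fn`, `reduce_fn`, `map_square`, `reduce_add`, `map_reduce` and `sum_of_squares_demo` are hypothetical, and the driver is a plain sequential loop standing in for whatever the FPGA or GPU back end would do.

```c
#include <stddef.h>

/* Hypothetical types for illustration only; not the API of the
 * library described in the abstract. */
typedef double (*map_fn)(double record);
typedef double (*reduce_fn)(double acc, double mapped);

/* Map step: processes one record of the stream on its own. */
double map_square(double record)
{
    return record * record;
}

/* Reduce step: merges one mapped output into the running result. */
double reduce_add(double acc, double mapped)
{
    return acc + mapped;
}

/* Sequential reference driver (an assumption of this sketch).
 * Each map call depends only on its own record, which is the
 * property that allows the loop to be pipelined or replicated. */
double map_reduce(const double *records, size_t n,
                  map_fn map, reduce_fn reduce, double init)
{
    double acc = init;
    size_t i;

    for (i = 0; i < n; ++i) {
        acc = reduce(acc, map(records[i]));
    }
    return acc;
}

/* Example problem: sum of squares of the records 1, 2, 3, 4. */
double sum_of_squares_demo(void)
{
    const double records[] = {1.0, 2.0, 3.0, 4.0};

    return map_reduce(records, sizeof records / sizeof records[0],
                      map_square, reduce_add, 0.0);
}
```

Calling `sum_of_squares_demo()` returns 30.0. A parallel implementation would typically also need the reduce operation to be associative so that partial results from several data paths can be merged in any grouping; addition is used here for that reason.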