We describe a programming framework for high-performance clusters with various hardware accelerators. With this framework, users can utilize the available heterogeneous resources productively and efficiently. The distributed application is highly modularized to support dynamic system configuration with changing types and numbers of accelerators. Multiple layers of communication interfaces are introduced to reduce the overhead of both control messages and data transfers. Parallelism can be exploited by controlling the accelerators under various schemes through a scheduling extension. The framework has been used to support physics simulation and financial application development. We achieve significant performance improvement on a 16-node cluster with FPGA and GPU accelerators.
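The abstract does not show the framework's API, so the following is only a minimal sketch of the dynamic, accelerator-agnostic scheduling idea it describes, with heterogeneous devices simulated by CPU threads. All names (Task, run_worker, the per-device cost parameters) and the work-distribution policy are assumptions for illustration, not the framework's actual interface.

```cpp
// Hypothetical sketch: dynamic self-scheduling of tasks across heterogeneous
// accelerator workers, simulated here with CPU threads. Names and the
// work-distribution policy are illustrative assumptions only.
#include <chrono>
#include <cstdio>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

struct Task { int id; };           // a unit of work (e.g. one simulation block)

std::queue<Task> task_queue;       // shared pool of outstanding tasks
std::mutex queue_mutex;

// Each worker models one accelerator (FPGA or GPU). Workers pull tasks from
// the shared queue as soon as they finish the previous one, so faster devices
// naturally receive more work (dynamic self-scheduling).
void run_worker(const std::string& device, int cost_ms) {
    while (true) {
        Task t;
        {
            std::lock_guard<std::mutex> lock(queue_mutex);
            if (task_queue.empty()) return;   // no work left
            t = task_queue.front();
            task_queue.pop();
        }
        // Stand-in for offloading the task to the device and waiting for it.
        std::this_thread::sleep_for(std::chrono::milliseconds(cost_ms));
        std::printf("task %d done on %s\n", t.id, device.c_str());
    }
}

int main() {
    for (int i = 0; i < 16; ++i) task_queue.push(Task{i});

    // Two heterogeneous workers with different (made-up) per-task costs.
    std::vector<std::thread> workers;
    workers.emplace_back(run_worker, std::string("FPGA"), 20);
    workers.emplace_back(run_worker, std::string("GPU"), 50);
    for (auto& w : workers) w.join();
    return 0;
}
```

Because each worker fetches its next task only when idle, the assignment adapts automatically to the relative speeds of the devices, which is the behaviour a fixed static partition of work would not provide.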