Comparing Hardware Accelerators in Scientific Applications: A Case Study

Authors:
Rick Weber; Akila Gothandaraman; Robert J. Hinde; Gregory D. Peterson
Affiliations:
University of Tennessee, Knoxville; University of Pittsburgh, Pittsburgh; University of Tennessee, Knoxville; University of Tennessee, Knoxville
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
2011

Citing 0
Cited 13

A code-based analytical approach for using separate device coprocessors in computing systems
ARCS'11 Proceedings of the 24th international conference on Architecture of computing systems

In search of numerical consistency in parallel programming
Parallel Computing

Seamlessly portable applications: Managing the diversity of modern heterogeneous systems
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers

From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming
Parallel Computing

Tuning solution of large non-Hermitian linear systems on multiple graphics processing unit accelerated workstations
International Journal of High Performance Computing Applications

Comparing CUDA, OpenCL and OpenGL implementations of the cardiac monodomain equations
PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part II

Three-dimensional thinning algorithms on graphics processing units and multicore CPUs
Concurrency and Computation: Practice & Experience

Optimizing Techniques for OpenCL Programs on Heterogeneous Platforms
International Journal of Grid and High Performance Computing

Glinda: a framework for accelerating imbalanced applications on heterogeneous platforms
Proceedings of the ACM International Conference on Computing Frontiers

Parallel unsupervised Synthetic Aperture Radar image change detection on a graphics processing unit
International Journal of High Performance Computing Applications

Box-counting algorithm on GPU and multi-core CPU: an OpenCL cross-platform study
The Journal of Supercomputing

An investigation of the performance portability of OpenCL
Journal of Parallel and Distributed Computing

Optimising space exploration of OpenCL for GPGPUs
International Journal of Computational Science and Engineering

Abstract

Multicore processors and a variety of accelerators have allowed scientific applications to scale to larger problem sizes. We present a performance, design methodology, platform, and architectural comparison of several application accelerators executing a Quantum Monte Carlo application. We compare the application's performance and programmability on a variety of platforms including CUDA with Nvidia GPUs, Brook+ with ATI graphics accelerators, OpenCL running on both multicore and graphics processors, C++ running on multicore processors, and a VHDL implementation running on a Xilinx FPGA. We show that OpenCL provides application portability between multicore processors and GPUs, but may incur a performance cost. Furthermore, we illustrate that graphics accelerators can make simulations involving large numbers of particles feasible.