A data-driven approach for executing the CG method on reconfigurable high-performance systems

Authors:
Fabian Nowak;Ingo Besenfelder;Wolfgang Karl;Mareike Schmidtobreick;Vincent Heuveline
Affiliations:
Chair for Computer Architecture, Karlsruhe Institute of Technology, Germany;Chair for Computer Architecture, Karlsruhe Institute of Technology, Germany;Chair for Computer Architecture, Karlsruhe Institute of Technology, Germany;Engineering Mathematics and Computing Lab, Karlsruhe Institute of Technology, Germany;Engineering Mathematics and Computing Lab, Karlsruhe Institute of Technology, Germany
Venue:
ARCS'13 Proceedings of the 26th international conference on Architecture of Computing Systems
Year:
2013

Abstract

Reconfigurable computing systems are a promising route to higher performance for numerical applications. We study the suitability of the Convey HC-1 for such applications by decomposing a preconditioned conjugate gradient (CG) method into several independent kernels that can operate concurrently. To overlap execution and minimize data transfers, we stream the data between the kernel units through a central set of buffers. A microprogrammable control unit orchestrates memory accesses, buffer reads and writes, and kernel execution, and allows further algorithms to be executed on the available kernel units. With only a single application engine, this accelerates the solution of the Poisson problem for large problem sizes by up to 10 times compared to a single-threaded software version on the HC-1 and by up to 1.2 times compared to a 2-socket hex-core Intel Xeon Westmere system with 24 hardware threads.
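
To make the kernel decomposition mentioned in the abstract concrete, the following is a minimal software sketch of a preconditioned CG iteration split into separate kernels. It is not the authors' hardware design: the function names, the 1D Poisson stencil, and the Jacobi preconditioner are assumptions chosen for illustration, and the kernels run sequentially here rather than concurrently with streamed buffers.

```python
# Illustrative sketch only: kernel names, the 1D Poisson operator and the
# Jacobi preconditioner are assumptions, not taken from the paper.

def stencil_matvec(p):
    """Kernel: q = A p for A = tridiag(-1, 2, -1), applied matrix-free."""
    n = len(p)
    q = []
    for i in range(n):
        left = p[i - 1] if i > 0 else 0.0
        right = p[i + 1] if i < n - 1 else 0.0
        q.append(2.0 * p[i] - left - right)
    return q

def dot(x, y):
    """Kernel: inner product of two vectors."""
    return sum(a * b for a, b in zip(x, y))

def axpy(alpha, x, y):
    """Kernel: return alpha * x + y as a new vector."""
    return [alpha * a + b for a, b in zip(x, y)]

def jacobi_precondition(r):
    """Kernel: z = M^{-1} r with M = diag(A) = 2 I (assumed preconditioner)."""
    return [0.5 * v for v in r]

def pcg(b, tol=1e-10, max_iter=1000):
    """Preconditioned CG for A x = b, composed from the kernels above."""
    x = [0.0] * len(b)
    r = list(b)                      # r = b - A x with x = 0
    z = jacobi_precondition(r)
    p = list(z)
    rz = dot(r, z)
    for it in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:  # stop once the residual norm is small
            return x, it
        q = stencil_matvec(p)
        alpha = rz / dot(p, q)
        x = axpy(alpha, p, x)        # x = x + alpha * p
        r = axpy(-alpha, q, r)       # r = r - alpha * q
        z = jacobi_precondition(r)
        rz_new = dot(r, z)
        p = axpy(rz_new / rz, p, z)  # p = z + beta * p
        rz = rz_new
    return x, max_iter
```

Calling `pcg([1.0] * 50)` returns the approximate solution and the number of iterations used. In each iteration the matrix-vector product, the two inner products, the vector updates, and the preconditioner application are distinct steps, which is the kind of structure that lends itself to separate kernel units.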