FPGA acceleration of a quantum Monte Carlo application

  • Authors:
  • Akila Gothandaraman;Gregory D. Peterson;G. L. Warren;Robert J. Hinde;Robert J. Harrison

  • Affiliations:
  • Department of Electrical Engineering and Computer Science, University of Tennessee, 414 Ferris Hall, Knoxville, TN 37996-2100, United States;Department of Electrical Engineering and Computer Science, University of Tennessee, 414 Ferris Hall, Knoxville, TN 37996-2100, United States;Department of Chemistry, University of Delaware, United States;Department of Chemistry, University of Tennessee, Knoxville, United States;Department of Chemistry, University of Tennessee, Knoxville, United States

  • Venue:
  • Parallel Computing
  • Year:
  • 2008

Abstract

Quantum Monte Carlo methods enable us to determine the ground-state properties of atomic or molecular clusters. Here, we present a reconfigurable computing architecture using Field Programmable Gate Arrays (FPGAs) to accelerate two computationally intensive kernels of a Quantum Monte Carlo (QMC) application applied to N-body systems: the potential energy and wave function calculations. We compare the performance of our application on two reconfigurable platforms. First, we use a dual-processor 2.4 GHz Intel Xeon augmented with two reconfigurable development boards consisting of Xilinx Virtex-II Pro FPGAs. Using this platform, we achieve a speedup of 3x over a software-only implementation. We then port the chemistry application to the Cray XD1 supercomputer equipped with Xilinx Virtex-II Pro and Virtex-4 FPGAs. The hardware-accelerated application on one node of this high-performance system, equipped with a single Virtex-4 FPGA, yields a speedup of approximately 25x over the serial reference code running on one node of the dual-processor dual-core 2.2 GHz AMD Opteron. This speedup is mainly attributed to pipelining, the use of fixed-point arithmetic for all calculations, and the fine-grained parallelism available on FPGAs. We can further enhance performance by operating multiple instances of our design in parallel.
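
The abstract credits fixed-point arithmetic for part of the speedup. As an illustration only, the Python sketch below sums a pairwise potential energy over an N-body configuration using integer (fixed-point) arithmetic and compares it with a floating-point reference. It is not the authors' hardware design: the Lennard-Jones pair potential in reduced units, the 16-bit fractional width, and every function name here are assumptions chosen for the example, since the abstract does not state them.

```python
import itertools

# Hypothetical Q-format: 16 fractional bits (an assumption, not from the paper).
FRAC_BITS = 16
SCALE = 1 << FRAC_BITS


def to_fixed(x: float) -> int:
    """Quantize a float to a fixed-point integer with FRAC_BITS fractional bits."""
    return int(round(x * SCALE))


def to_float(q: int) -> float:
    """Convert a fixed-point integer back to a float."""
    return q / SCALE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two fixed-point values; the raw product carries 2*FRAC_BITS
    fractional bits, so shift right to return to the working format."""
    return (a * b) >> FRAC_BITS


def squared_distance_fixed(p, q) -> int:
    """Squared Euclidean distance between two fixed-point coordinate tuples."""
    acc = 0
    for a, b in zip(p, q):
        d = a - b
        acc += fixed_mul(d, d)
    return acc


def pair_potential(r2: float) -> float:
    """Placeholder pair potential as a function of squared distance.
    Lennard-Jones in reduced units is assumed purely for illustration."""
    inv6 = 1.0 / r2 ** 3
    return 4.0 * (inv6 * inv6 - inv6)


def total_potential_fixed(positions) -> float:
    """Sum the pair potential over all unique pairs, accumulating in fixed point."""
    fixed = [tuple(to_fixed(c) for c in p) for p in positions]
    total = 0
    for p, q in itertools.combinations(fixed, 2):
        r2 = squared_distance_fixed(p, q)
        # Each pair term is quantized before accumulation, as an integer
        # datapath would require.
        total += to_fixed(pair_potential(to_float(r2)))
    return to_float(total)


def total_potential_float(positions) -> float:
    """Floating-point reference for the same sum."""
    total = 0.0
    for p, q in itertools.combinations(positions, 2):
        r2 = sum((a - b) ** 2 for a, b in zip(p, q))
        total += pair_potential(r2)
    return total


if __name__ == "__main__":
    atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0), (0.0, 0.0, 1.5)]
    print(total_potential_fixed(atoms), total_potential_float(atoms))
```

In this sketch each pair term is independent of the others, which is the property that would let a hardware implementation stream pairs through a pipeline and run several such pipelines side by side. The fractional width trades accuracy against resource usage, so the value chosen here should be read as a placeholder.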