Tuning solution of large non-Hermitian linear systems on multiple graphics processing unit accelerated workstations

  • Authors:
  • Florian Ries; Tommaso De Marco; Roberto Guerrieri

  • Affiliations:
  • Advanced Research Center on Electronic Systems for Information and Communication Technologies, E. De Castro (ARCES), Viale Carlo Pepoli 3/2, 40123 Bologna, Italy (all three authors)

  • Venue:
  • International Journal of High Performance Computing Applications
  • Year:
  • 2012

Abstract

This work deals with the solution of large non-Hermitian linear systems on desktop workstations with multiple graphics processing units (GPUs). While our implementation is motivated by the need to accelerate volume conductor modeling for bioelectrical brain imaging, the problem itself is common in scientific computing: numerically solving a complex-valued partial differential equation typically requires solving a sparse, complex, non-Hermitian linear system. For problem sizes in the millions of unknowns, this can take a long time even with highly optimized CPU-based solvers. Our GPU-accelerated solver outperforms an optimized OpenMP-based reference running on two quad-core CPUs by a factor of up to 31 in single precision and up to 7 in double precision, at the cost of a very modest hardware upgrade of two dual-GPU GTX 295 graphics cards. A pair of stronger Fermi GPUs (GTX 480) achieves speedups of 30× in single precision and 15× in double precision.
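To illustrate why complex-valued partial differential equations tend to produce non-Hermitian systems, the following minimal sketch discretizes a toy 1D model problem, -u'' + i·c·u = f, with central finite differences and stores the matrix in compressed sparse row (CSR) form. The model equation, the function names, and the parameters `n`, `c`, and `h` are assumptions chosen for illustration only; they are not taken from the paper, and the paper's volume conductor model and solver are not reproduced here.

```python
# Illustrative sketch only: a toy 1D problem, not the paper's brain model.
# Discretizes -u'' + i*c*u = f on a uniform grid (Dirichlet boundaries),
# giving a tridiagonal matrix with a complex diagonal.

def build_csr(n, c, h):
    """Return (values, col_idx, row_ptr) for the n x n matrix in CSR form."""
    values, col_idx, row_ptr = [], [], [0]
    diag = complex(2.0 / h**2, c)      # real stencil term plus imaginary part
    off = complex(-1.0 / h**2, 0.0)    # coupling to left/right neighbours
    for i in range(n):
        if i > 0:
            values.append(off)
            col_idx.append(i - 1)
        values.append(diag)
        col_idx.append(i)
        if i < n - 1:
            values.append(off)
            col_idx.append(i + 1)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr

def entry(csr, i, j):
    """Look up A[i, j]; entries not stored in CSR are zero."""
    values, col_idx, row_ptr = csr
    for k in range(row_ptr[i], row_ptr[i + 1]):
        if col_idx[k] == j:
            return values[k]
    return 0j

def is_symmetric(csr, n):
    """True if A equals its transpose."""
    return all(entry(csr, i, j) == entry(csr, j, i)
               for i in range(n) for j in range(n))

def is_hermitian(csr, n):
    """True if A equals its conjugate transpose."""
    return all(entry(csr, i, j) == entry(csr, j, i).conjugate()
               for i in range(n) for j in range(n))

n = 5
A = build_csr(n, c=1.0, h=0.1)
print("symmetric:", is_symmetric(A, n))
print("Hermitian:", is_hermitian(A, n))
```

In this toy case the matrix is complex symmetric but not Hermitian, because the diagonal carries a nonzero imaginary part. Solvers that rely on Hermitian structure, such as the conjugate gradient method, therefore do not apply directly, and a method designed for general non-Hermitian systems is needed.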