Paradigmatic shifts for exascale supercomputing

  • Authors:
  • Neal E. Davis; Robert W. Robey; Charles R. Ferenbaugh; David Nicholaeff; Dennis P. Trujillo

  • Affiliations:
  • XCP-4 Methods & Algorithms, Los Alamos National Laboratory, Los Alamos, USA and Department of Nuclear, Plasma, & Radiological Engineering, University of Illinois at Urbana-Champaign, Urbana, USA; XCP-2 Eulerian Applications, Los Alamos National Laboratory, Los Alamos, USA; HPC-1 Scientific Software Engineering, Los Alamos National Laboratory, Los Alamos, USA; XCP-4 Methods & Algorithms, Los Alamos National Laboratory, Los Alamos, USA and Department of Physics & Astronomy, University of California at Los Angeles, Los Angeles, USA; XCP-4 Methods & Algorithms, Los Alamos National Laboratory, Los Alamos, USA and Department of Physics, New Mexico State University, Las Cruces, USA

  • Venue:
  • The Journal of Supercomputing
  • Year:
  • 2012


Abstract

As the next generation of supercomputers reaches the exascale, the dominant design parameter governing performance will shift from hardware to software. Intelligent usage of memory access, vectorization, and intranode threading will become critical to the performance of scientific applications and numerical calculations on exascale supercomputers. Although challenges remain in effectively programming the heterogeneous devices likely to be utilized in future supercomputers, new languages and tools are providing a pathway for application developers to tackle this new frontier. These languages include open programming standards such as OpenCL and OpenACC, as well as widely adopted languages such as CUDA; also of importance are high-quality libraries such as CUDPP and Thrust. This article surveys a purposely diverse set of proof-of-concept applications developed at Los Alamos National Laboratory. We find that the capability level of accelerator computing hardware and languages has moved beyond regular-grid finite difference calculations and molecular dynamics codes. More advanced applications requiring dynamic memory allocation, such as cell-based adaptive mesh refinement, can now be addressed, and with more effort even unstructured mesh codes can be moved to the GPU.