Energy and performance characteristics of different parallel implementations of scientific applications on multicore systems

  • Authors:
  • Charles Lively; Xingfu Wu; Valerie Taylor; Shirley Moore; Hung-Ching Chang; Kirk Cameron

  • Affiliations:
  • Department of Computer Science & Engineering, Texas A&M University, USA; Department of Computer Science & Engineering, Texas A&M University, USA; Department of Computer Science & Engineering, Texas A&M University, USA; Department of Electrical Engineering and Computer Science, University of Tennessee-Knoxville, USA; Department of Computer Science, Virginia Tech, USA; Department of Computer Science, Virginia Tech, USA

  • Venue:
  • International Journal of High Performance Computing Applications
  • Year:
  • 2011

Abstract

Energy consumption is a major concern with high-performance multicore systems. In this paper, we explore the energy consumption and performance (execution time) characteristics of different parallel implementations of scientific applications. In particular, the experiments compare Message Passing Interface (MPI)-only and hybrid MPI/OpenMP implementations of the NAS (NASA Advanced Supercomputing) BT (Block Tridiagonal) benchmark (strong scaling), a Lattice Boltzmann application (strong scaling), and the Gyrokinetic Toroidal Code, GTC (weak scaling), and also examine central processing unit (CPU) frequency scaling. Experiments were conducted on a system instrumented to obtain power information; this system consists of eight nodes with four cores per node. The results indicate that, on 16 or fewer cores, whether the MPI-only or the hybrid implementation performs best depends on the application. On 32 cores, the results were consistent: the hybrid implementation resulted in lower execution time and energy consumption. With CPU frequency scaling, the best case for energy saving was not the best case for execution time.