Introduction to High Performance Computing for Scientists and Engineers

Authors:
Georg Hager; Gerhard Wellein
Venue:
Introduction to High Performance Computing for Scientists and Engineers
Year:
2010

Citing 0
Cited 9

High-order commutator-free exponential time-propagation of driven quantum systems
Journal of Computational Physics

Parallel programming: design of an overview class
Proceedings of the 2011 ACM SIGPLAN X10 Workshop

Link-wise artificial compressibility method
Journal of Computational Physics

Expression Templates Revisited: A Performance Analysis of Current Methodologies
SIAM Journal on Scientific Computing

Programming hybrid architectures
Proceedings of the ATIP/A*CRC Workshop on Accelerator Technologies for High-Performance Computing: Does Asia Lead the Way?

Hierarchical parallel approach in vascular network modeling: hybrid MPI+OpenMP implementation
PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part I

Agent-based simulation for large-scale emergency response: A survey of usage and implementation
ACM Computing Surveys (CSUR)

Application of the ParalleX execution model to stencil-based problems
Computer Science - Research and Development

Pushing the limits for medical image reconstruction on recent standard multicore processors
International Journal of High Performance Computing Applications

Quantified Score

Hi-index	0.01

Abstract

Written by high performance computing (HPC) experts, Introduction to High Performance Computing for Scientists and Engineers provides a solid introduction to current mainstream computer architecture, dominant parallel programming models, and useful optimization strategies for scientific HPC. Through their work in a scientific computing center, the authors gained a unique perspective on the requirements and attitudes of both the users and the manufacturers of parallel computers.

The text first introduces the architecture of modern cache-based microprocessors and discusses their inherent performance limitations, then describes general optimization strategies for serial code on cache-based architectures. It next covers shared- and distributed-memory parallel computer architectures and the most relevant network topologies. After discussing parallel computing on a theoretical level, the authors show how to avoid or ameliorate typical performance problems connected with OpenMP. They then present optimization techniques for cache-coherent nonuniform memory access (ccNUMA) systems, examine distributed-memory parallel programming with the Message Passing Interface (MPI), and explain how to write efficient MPI code. The final chapter focuses on hybrid programming with MPI and OpenMP.

Users of high performance computers often have no idea what factors limit time to solution or whether it makes sense to think about optimization at all. This book fosters an intuitive understanding of performance limitations without requiring extensive computer science knowledge. It also prepares readers for studying more advanced literature.