The NAS Parallel Benchmarks (NPB) are a well-known suite of benchmarks that serve as proxies for scientific computing applications. They specify several problem sizes representing how such applications might run on HPC systems of different scales. However, even the largest published problem size (class F) is far too small to properly exercise a petascale supercomputer. Our work shows how to scale the Block Tridiagonal (BT) benchmark from today's published sizes to petascale and exascale computing systems. In this paper we discuss the pros and cons of various ways of scaling, examine how scaling BT would affect computation, memory access, and communication, and identify the expected bottleneck, which turns out to be neither memory bandwidth nor communication bandwidth but network latency. We present two complementary ways to overcome this latency obstacle. We also describe a practical method for gathering approximate performance data for BT at exascale on actual hardware, without requiring an exascale system.
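The latency-bottleneck claim can be illustrated with a back-of-envelope strong-scaling model. The sketch below is an assumption-laden illustration, not an analysis from the paper: the grid size, latency, and bandwidth constants are hypothetical, and the simple latency-plus-bandwidth message cost model is our own simplification.

```python
# Rough strong-scaling model for a BT-like solver on a 2-D process grid.
# All constants are illustrative assumptions, not figures from the paper.
import math

N = 2560          # assumed grid points per dimension (order of class F)
LATENCY = 2e-6    # assumed per-message network latency, seconds
BANDWIDTH = 10e9  # assumed per-link network bandwidth, bytes/second
WORD = 8          # double-precision word, bytes

def face_message_time(P):
    """Return (latency, transfer time) for exchanging one subdomain face
    when N^2 points are split across a sqrt(P) x sqrt(P) process grid."""
    n = N / math.sqrt(P)            # points per side of one rank's subdomain
    msg_bytes = 5 * n * n * WORD    # 5 solution variables per face point
    return LATENCY, msg_bytes / BANDWIDTH

for P in (1024, 65536, 1048576):
    lat, xfer = face_message_time(P)
    frac = lat / (lat + xfer)
    print(f"P={P:>8}: latency fraction of message time {frac:5.1%}")
```

Under these assumptions the fixed per-message latency goes from a small overhead at a thousand ranks to the dominant cost at a million ranks, because the per-face message shrinks with the subdomain while the latency term does not.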