Exascale workload characterization and architecture implications

Authors:
Prasanna Balaprakash;Darius Buntinas;Anthony Chan;Apala Guha;Rinku Gupta;Sri Hari Krishna Narayanan;Andrew A. Chien;Paul Hovland;Boyana Norris
Affiliations:
Argonne National Laboratory;Argonne National Laboratory;Argonne National Laboratory;Argonne National Laboratory and University of Chicago;Argonne National Laboratory;Argonne National Laboratory;Argonne National Laboratory and University of Chicago;Argonne National Laboratory;Argonne National Laboratory
Venue:
Proceedings of the High Performance Computing Symposium
Year:
2013

Abstract

We use a hybrid methodology based on binary instrumentation and performance counters to characterize a set of proxy applications (mini-apps and PETSc applications) that represent a broad range of scientific applications, in particular DOE's future high-performance computing workloads. From this empirical basis, we build statistical models that extrapolate application properties (instruction mix, memory size, and memory bandwidth) as a function of problem size. We validate these models and use them to project the first quantitative characterization of an exascale computing workload. Finally, we use the projected exascale workload to evaluate a radical new exascale architecture: stacked DRAM with processor under memory (PUM). Of the two projections, one shows major potential benefits from PUM. The second, more conservative projection suggests that only a small number of exascale applications are likely to be memory-bandwidth limited, and that even these are fundamentally memory-capacity limited.
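
The extrapolation step described in the abstract (fit a model of an application property against problem size, validate it, then project to a larger size) can be illustrated with a minimal sketch. The abstract does not state which statistical models the authors use, so the power-law form, the function names `fit_power_law` and `predict`, and all numbers below are assumptions chosen purely for illustration; the data are synthetic and are not measurements from the paper.

```python
import math


def fit_power_law(sizes, values):
    """Least-squares fit of values ~ a * sizes**b, done in log-log space.

    Assumed model form for illustration; not the paper's stated model.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)   # intercept in log-log space = log(a)
    return a, b


def predict(a, b, size):
    """Evaluate the fitted power law at a given problem size."""
    return a * size ** b


# Synthetic (hypothetical) memory footprints in MB, generated from a known
# power law so that the fit can be checked; these are not real measurements.
problem_sizes = [64, 128, 256, 512]
memory_mb = [2.0 * n ** 1.5 / 1000 for n in problem_sizes]

# Fit on the three smallest sizes and hold out the largest for validation.
a, b = fit_power_law(problem_sizes[:3], memory_mb[:3])
held_out = predict(a, b, problem_sizes[3])
relative_error = abs(held_out - memory_mb[3]) / memory_mb[3]

# Project to a problem size well beyond anything that was "measured".
projected_mb = predict(a, b, 4096)

print(f"fitted exponent b = {b:.3f}")
print(f"held-out relative error = {relative_error:.2e}")
print(f"projected memory at size 4096 = {projected_mb:.1f} MB")
```

Holding out the largest measured size before projecting mirrors the validate-then-project order in the abstract. With real measurements the held-out error would not be near zero, and it gives a rough sense of how far the projection can be trusted.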