A key problem facing application developers is that they are expected to exploit extreme levels of parallelism soon after future leadership-class machines are delivered, yet developing applications that expose sufficient concurrency is a time-consuming process that requires experimentation. At the same time, because an exascale machine is expensive to build and operate, tighter engineering margins will have to be applied to its design. Simple metrics such as the computation-to-communication ratio will not sufficiently specify machine requirements. Simulation fills this gap: it allows extreme-scale architectures to be studied with the complex interactions between hardware and software components explicitly included, and it can be used for correctness checking as well as performance estimation. The simulator we discuss in this paper can be driven either by trace files, typically generated by running an actual application on real hardware, or by a skeleton application. A skeleton application is designed to have the control flow of a real application, but with expensive computations and large data transfers replaced by discrete events whose timings are determined by models. Using skeleton applications, we can predict application performance at levels of parallelism unobtainable on any current computational platform. The skeleton application can be modified to experiment with different communication strategies and programming models. Since the simulated machine is under our control, we can also experiment with different network topologies, routing algorithms, bandwidths, latencies, failure modes, core-to-node ratios, and so on. In this paper, we use the macroscale components of the Structural Simulation Toolkit, which provide coarse-grained simulation, to illustrate the exploration of alternative programming models at extreme scale.
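To make the skeleton-application idea concrete, the following is a minimal sketch in Python and not the Structural Simulation Toolkit API. It assumes a hypothetical iterative code in which every rank computes and then passes a message to its right neighbour on a ring. The compute kernel and the data transfer are replaced by discrete events whose durations come from simple models (work divided by rate, and latency plus size divided by bandwidth). The names `Machine`, `compute_time`, `message_time`, and `simulate_ring`, as well as all parameter values, are illustrative assumptions.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    """Hypothetical machine model; every parameter is an illustrative assumption."""
    flops_per_sec: float   # sustained compute rate of one rank
    latency_sec: float     # per-message network latency
    bytes_per_sec: float   # point-to-point bandwidth


def compute_time(m: Machine, flops: float) -> float:
    """Model replacing an expensive computation: work / rate."""
    return flops / m.flops_per_sec


def message_time(m: Machine, nbytes: float) -> float:
    """Model replacing a large data transfer: latency + size / bandwidth."""
    return m.latency_sec + nbytes / m.bytes_per_sec


def simulate_ring(m: Machine, n_ranks: int, n_steps: int,
                  flops_per_step: float, bytes_per_msg: float) -> float:
    """Return the simulated time at which the last rank finishes.

    Control flow of the skeleton: in each step a rank computes, sends to its
    right neighbour, and waits for the message from its left neighbour.
    """
    events = []  # min-heap of (time, sequence number, kind, rank, step)
    seq = 0

    def schedule(t, kind, rank, step):
        nonlocal seq
        heapq.heappush(events, (t, seq, kind, rank, step))
        seq += 1

    computed = set()   # (rank, step) pairs whose compute phase has ended
    received = set()   # (rank, step) pairs whose incoming message has arrived
    finish = 0.0

    for rank in range(n_ranks):
        schedule(compute_time(m, flops_per_step), "compute_done", rank, 0)

    while events:
        t, _, kind, rank, step = heapq.heappop(events)
        if kind == "compute_done":
            computed.add((rank, step))
            right = (rank + 1) % n_ranks
            schedule(t + message_time(m, bytes_per_msg), "msg_arrival", right, step)
        else:
            received.add((rank, step))

        # Events are processed in time order, so when both conditions hold
        # the current time t is the moment this rank completes the step.
        if (rank, step) in computed and (rank, step) in received:
            if step + 1 < n_steps:
                schedule(t + compute_time(m, flops_per_step),
                         "compute_done", rank, step + 1)
            else:
                finish = max(finish, t)
    return finish


if __name__ == "__main__":
    base = Machine(flops_per_sec=1e9, latency_sec=1e-6, bytes_per_sec=1e9)
    fast_net = Machine(flops_per_sec=1e9, latency_sec=1e-6, bytes_per_sec=4e9)
    for name, machine in (("base", base), ("fast_net", fast_net)):
        t = simulate_ring(machine, n_ranks=1024, n_steps=10,
                          flops_per_step=1e6, bytes_per_msg=1e6)
        print(f"{name}: simulated run time {t:.6f} s")
```

Because the machine is just a parameter object, changing the bandwidth or latency and rerunning is the kind of "what if" experiment the abstract describes. A production simulator would additionally model topology, routing, and contention, which this sketch deliberately omits.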