Straightforward trace collection and processing become increasingly challenging, and ultimately impractical, for more complex, long-running, highly parallel applications. Accordingly, the SCALASCA project is extending the KOJAK measurement system for MPI, OpenMP and partitioned global address space (PGAS) parallel applications to incorporate runtime management and summarisation capabilities. Runtime summarisation yields a more scalable and effective profile of parallel execution performance, which serves as an initial overview and directs instrumentation and event tracing to the key functions and callpaths that warrant comprehensive analysis. The design and restructuring of the revised measurement system are described, highlighting the synergies possible from integrating runtime callpath summarisation with event tracing for scalable diagnosis of parallel execution performance. Early results from measurements of 16,384 MPI processes on IBM BlueGene/L already demonstrate considerably improved scalability.