Scientific applications should be well balanced in order to achieve high scalability on current and future high-end massively parallel systems. However, identifying the sources of load imbalance in such applications is not trivial, and current state-of-the-art performance analysis tools do not provide an efficient mechanism to help users identify the main areas of load imbalance in an application. In this paper we discuss a new set of metrics that we defined to identify and measure application load imbalance. We then describe the extensions made to the Cray performance measurement and analysis infrastructure to detect application load imbalance and present it to the user in an insightful way.