This paper describes the design and prototype implementation of a performance tool that answers performance questions spanning multiple program executions from all stages of an application's lifespan. We use the scientific experimentation archetype as the basis for designing an Experiment Management environment for parallel performance. In our model, information from all experiments for one application, including the components of the code executed, the execution environment, and the performance data collected, is gathered in a Program Space. Our Experiment Management tool enables exploration of this space with a simple naming mechanism, a selection and query facility, and a set of visualizations. A key component of this work is the ability to automatically describe the differences between two runs of a program: both the structural differences (differences in program source code and in the resources used at runtime) and the performance variation (how the resources were used and how that usage changed from one run to the next).

We present a new approach to automated performance diagnosis that incorporates knowledge from previous runs of the same application. The result is a performance tool that learns from each diagnostic program run, adapting its search strategy to obtain more useful diagnoses more quickly. We show performance gains of up to 98% obtained by incorporating historical knowledge into the Performance Consultant's search strategy. These results demonstrate the utility of our approach for repeated performance diagnosis of similar program runs, a common scenario when tuning parallel applications. The improvements achieved show that our new approach to gathering and storing historical application data can be successfully applied to the problem of automating performance diagnosis.
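
The comparison of two runs described above can be illustrated with a minimal sketch. It is not the tool's implementation: the `Run` record, the resource naming scheme (`/Code/...`, `/Machine/...`, `/Process/...`), the metric names, and both functions are hypothetical, chosen only to show a structural difference computed as set differences over resources and a performance variation computed as per-metric deltas.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """Hypothetical record of one program execution (one experiment)."""
    name: str
    resources: frozenset                          # code modules, machines, processes used
    metrics: dict = field(default_factory=dict)   # metric name -> measured value


def structural_difference(a: Run, b: Run) -> dict:
    """Resources unique to each run, plus those both runs share."""
    return {
        "only_in_first": sorted(a.resources - b.resources),
        "only_in_second": sorted(b.resources - a.resources),
        "shared": sorted(a.resources & b.resources),
    }


def performance_difference(a: Run, b: Run) -> dict:
    """Change (second minus first) for every metric recorded in both runs."""
    shared = a.metrics.keys() & b.metrics.keys()
    return {m: b.metrics[m] - a.metrics[m] for m in sorted(shared)}


# Two illustrative runs of the same application; all values are made up.
run1 = Run(
    "v1",
    frozenset({"/Code/solver.c", "/Machine/node0", "/Process/p0"}),
    {"cpu_time": 10.0, "sync_wait": 4.0},
)
run2 = Run(
    "v2",
    frozenset({"/Code/solver.c", "/Code/cache.c", "/Machine/node0", "/Process/p0"}),
    {"cpu_time": 8.0, "sync_wait": 1.0},
)

structure = structural_difference(run1, run2)
performance = performance_difference(run1, run2)
print(structure)
print(performance)
```

Restricting the performance comparison to the resources and metrics that both runs share is one plausible design choice under these assumptions: a value measured on a resource that exists in only one run has no counterpart to compare against.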