Performance monitoring of large-scale parallel computers creates a dilemma: we need to collect detailed information to find performance bottlenecks, yet collecting all of this data can itself introduce serious data collection bottlenecks. At the same time, users are inundated with volumes of complex graphs and tables that require a performance expert to interpret. We present a new approach, called the W3 Search Model, that addresses both problems by combining dynamic on-the-fly selection of what performance data to collect with decision support to assist users with the selection and presentation of performance data. We present a case study describing how a prototype implementation of our technique was able to identify the bottlenecks in three real programs. In addition, we were able to reduce the amount of performance data collected by a factor ranging from 13 to 700 compared to traditional sampling- and trace-based instrumentation techniques.