A mechanism for efficient debugging of parallel programs
PADD '88 Proceedings of the 1988 ACM SIGPLAN and SIGOPS workshop on Parallel and distributed debugging
Race Frontier: reproducing data races in parallel-program debugging
PPOPP '91 Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming
Optimistic parallelization of communicating sequential processes
PPOPP '91 Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming
Event-based performance perturbation: a case study
PPOPP '91 Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming
Parallel program debugging with on-the-fly anomaly detection
Proceedings of the 1990 ACM/IEEE conference on Supercomputing
Restoring consistent global states of distributed computations
PADD '91 Proceedings of the 1991 ACM/ONR workshop on Parallel and distributed debugging
What are race conditions?: Some issues and formalizations
ACM Letters on Programming Languages and Systems (LOPLAS)
MemSpy: analyzing memory system bottlenecks in programs
SIGMETRICS '92/PERFORMANCE '92 Proceedings of the 1992 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Parallel program performance metrics: a comprison and validation
Proceedings of the 1992 ACM/IEEE conference on Supercomputing
Adaptive message logging for incremental replay of message-passing programs
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
PVM: Parallel virtual machine: a users' guide and tutorial for networked parallel computing
PVM: Parallel virtual machine: a users' guide and tutorial for networked parallel computing
Distributed snapshots: determining global states of distributed systems
ACM Transactions on Computer Systems (TOCS)
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Exploiting hardware performance counters with flow and context sensitive profiling
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Continuous profiling: where have all the cycles gone?
Proceedings of the sixteenth ACM symposium on Operating systems principles
Using cost to control instrumentation overhead
Theoretical Computer Science - Special issue on parallel computing
Time, clocks, and the ordering of events in a distributed system
Communications of the ACM
Performance analysis using the MIPS R10000 performance counters
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
IPS-2: The Second Generation of a Parallel Program Measurement System
IEEE Transactions on Parallel and Distributed Systems
Efficient race detection for message-passing programs with nonblocking sends and receives
SPDP '95 Proceedings of the 7th IEEE Symposium on Parallel and Distributeed Processing
Focusing processor policies via critical-path prediction
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
A Tool to Help Tune where Computation Is Performed
IEEE Transactions on Software Engineering
Online Computation of Critical Paths for Multithreaded Languages
IPDPS '00 Proceedings of the 15 IPDPS 2000 Workshops on Parallel and Distributed Processing
Quantifying instruction criticality for shared memory multiprocessors
Proceedings of the fifteenth annual ACM symposium on Parallel algorithms and architectures
An API for Runtime Code Patching
International Journal of High Performance Computing Applications
Design and Evaluation of Dynamic Key Message Algorithms for Cluster Computing
HPCASIA '05 Proceedings of the Eighth International Conference on High-Performance Computing in Asia-Pacific Region
A regression-based approach to scalability prediction
Proceedings of the 22nd annual international conference on Supercomputing
Using link gradients to predict the impact of network latency on multitier applications
IEEE/ACM Transactions on Networking (TON)
PerWiz: a what-if prediction tool for tuning message passing programs
VECPAR'04 Proceedings of the 6th international conference on High Performance Computing for Computational Science
PATMOS'06 Proceedings of the 16th international conference on Integrated Circuit and System Design: power and Timing Modeling, Optimization and Simulation
Critical lock analysis: diagnosing critical section bottlenecks in multithreaded applications
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Hi-index | 0.00 |
In this paper, we introduce a runtime, nontrace-based algorithm to compute the critical path profile of the execution of message passing and shared-memory parallel programs. Our algorithm permits starting or stopping the critical path computation during program execution and reporting intermediate values. We also present an online algorithm to compute a variant of critical path, called critical path zeroing, that measures the reduction in application execution time that improving a selected procedure will have. Finally, we present a brief case study to quantify the runtime overhead of our algorithm and to show that online critical path profiling can be used to find program bottlenecks.