Parallel program debugging with on-the-fly anomaly detection

  • Authors:
  • Robert Hood; Ken Kennedy; John Mellor-Crummey

  • Affiliations:
  • Department of Computer Science, Rice University, P. O. Box 1892, Houston, TX (all three authors)

  • Venue:
  • Proceedings of the 1990 ACM/IEEE conference on Supercomputing
  • Year:
  • 1990

Abstract

We describe an approach for parallel debugging that coordinates static analysis with efficient on-the-fly access anomaly detection. We are developing on-the-fly instrumentation mechanisms for the structured synchronization primitives of Parallel Computing Forum (PCF) Fortran, the emerging standard for parallel Fortran. For programs without nested parallelism, it is possible to bound the cost of detection to a small constant at each shared access and thread creation point—in preliminary experiments this overhead is less than 40%. Our instrumentation techniques guarantee that we can isolate schedule-dependent behavior in a schedule-independent fashion. The result is that a single instrumented execution will either report sources of schedule-dependent behavior, or it will validate that all executions of the program on the same data compute the same result. When an instrumented execution is being used solely to find sources of schedule-dependent behavior, its cost can be reduced by slicing out computations that do not contribute to race conditions. Our approach to debugging is particularly well-suited for inclusion in a parallel program development environment; we describe our ongoing efforts to incorporate it in the ParaScope environment.
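
To make the idea of constant-cost, on-the-fly access anomaly detection concrete, here is a minimal sketch in Python. It is not the authors' instrumentation and it does not model PCF Fortran. It assumes a program whose only parallelism is a sequence of non-nested parallel loops, each followed by an implicit barrier, and it assumes every instrumented access happens inside such a loop. All names (`AnomalyDetector`, `Shadow`, `begin_parallel_loop`, `MANY`) are hypothetical. Each shared variable gets a fixed-size shadow record, so every check costs a constant number of comparisons.

```python
from dataclasses import dataclass

# Sentinel meaning "two or more distinct iterations have read this variable
# in the current loop". Assumes real iteration numbers are non-negative.
MANY = -1


@dataclass
class Shadow:
    """Fixed-size access history kept for one shared variable."""
    write_epoch: int = -1   # parallel loop in which the last write occurred
    writer: int = -1        # iteration that performed that write
    read_epoch: int = -1    # parallel loop in which the last read occurred
    reader: int = -1        # reading iteration, or MANY


class AnomalyDetector:
    """Hypothetical detector for programs without nested parallelism."""

    def __init__(self):
        self.epoch = 0      # identifies the current parallel loop
        self.shadow = {}    # variable name -> Shadow
        self.reports = []   # (variable, kind, earlier iteration, later iteration)

    def begin_parallel_loop(self):
        # Accesses in different loops are ordered by the barrier between them,
        # so only accesses that share an epoch can conflict.
        self.epoch += 1

    def _cell(self, var):
        return self.shadow.setdefault(var, Shadow())

    def read(self, var, iteration):
        s = self._cell(var)
        if s.write_epoch == self.epoch and s.writer != iteration:
            self.reports.append((var, "write-read", s.writer, iteration))
        if s.read_epoch != self.epoch:
            s.read_epoch, s.reader = self.epoch, iteration
        elif s.reader != iteration:
            s.reader = MANY

    def write(self, var, iteration):
        s = self._cell(var)
        if s.write_epoch == self.epoch and s.writer != iteration:
            self.reports.append((var, "write-write", s.writer, iteration))
        if s.read_epoch == self.epoch and s.reader != iteration:
            self.reports.append((var, "read-write", s.reader, iteration))
        s.write_epoch, s.writer = self.epoch, iteration
```

In this sketch, two accesses to the same variable from different iterations of the same loop are flagged whenever at least one of them is a write, regardless of the order in which the iterations happen to log them. That mirrors, in a very simplified form, the abstract's point that a single instrumented run can expose schedule-dependent behavior independently of the schedule it actually observed.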