Low-Cost Non-Intrusive Debugging Strategies for Distributed Parallel Programs

  • Authors:
  • Michael D. Beynon; Henrique Andrade; Joel Saltz

  • Venue:
  • CLUSTER '02 Proceedings of the IEEE International Conference on Cluster Computing
  • Year:
  • 2002

Abstract

Debugging is an important and challenging component of the software development cycle. Proper tools that help trace execution, inspect variable values, perform postmortem analysis, and dynamically attach to running processes, among other tasks, can greatly increase programmer productivity by reducing the time needed to understand incorrect behavior. It is well known that many more hours are spent debugging software than compiling source code [11]; having the appropriate tools and making the best use of the information they provide is therefore fundamental. When a programmer deals with parallel programs in shared or distributed memory settings, debugging becomes considerably more complicated because of the many potential interactions between threads and processes. Data races, deadlocks, synchronization issues, and communication problems must be dealt with in addition to traditional sequential (single-process) problems such as memory leaks and memory overruns. Because debugging is clearly important in both sequential and parallel/distributed environments, much effort has been focused on this task, ranging from standardization initiatives [6] and lists of requirements [3] to specific research prototypes [5, 14, 15] and commercial and open source products [4, 7, 9, 10, 13].
GDB [12] is freely available and omnipresent in most research labs and universities, which contributes to its status as a debugging tool of choice for many programmers writing sequential code. Unlike TotalView [4], for example, it does not target the particular debugging issues presented by parallel/distributed applications. Nevertheless, newer versions of GDB are quite capable of dealing with multi-threaded programs. There are also many capable parallel/cluster programming environments that go beyond the ideas presented in this paper.
We present this work as a set of strategies that can be incorporated into existing projects somewhat easily, without capital expenditure and without adopting a programming environment. In this work, we show how five low-cost, non-intrusive techniques built on free commodity tools such as GDB can be used to improve the debugging of multi-threaded and/or distributed parallel programs. These techniques have been used in the development of two major middleware systems, DataCutter [2] and MQO [1], and have proven their value by reducing the time needed to detect and correct bugs.