The Tool Dæmon Protocol (TDP)

Authors:
Barton Miller;Ana Cortes;Miquel A. Senar;Miron Livny
Affiliations:
University of Wisconsin, Madison;Universitat Autònoma de Barcelona, Spain;Universitat Autònoma de Barcelona, Spain;University of Wisconsin, Madison
Venue:
Proceedings of the 2003 ACM/IEEE conference on Supercomputing
Year:
2003

Abstract

Run-time tools are crucial to program development. In our desktop computer environments, we take for granted the availability of tools for operations such as debugging, profiling, tracing, checkpointing, and visualization. When programs move into distributed or Grid environments, it is difficult to find such tools. This difficulty is caused by the complex interactions necessary among the application program, the operating system, and the layers of job scheduling and process management software. As a result, each run-time tool must be individually ported to run under a particular job management system; for m tools and n environments, the problem becomes an m × n effort, rather than the hoped-for m + n effort. Variations in underlying operating systems can make this problem even worse. The consequence of this situation is a paucity of tools in distributed and Grid computing environments. In response to the problem, we have analyzed a variety of job scheduling environments and run-time tools to better understand their interactions. From this analysis, we isolated what we believe are the essential interactions between the run-time tool, the job scheduler and resource manager, and the application program. We propose a standard interface, called the Tool Dæmon Protocol (TDP), that codifies these interactions and provides the necessary communication functions. We have implemented a pilot TDP library and experimented with Parador, a prototype that uses the Paradyn Parallel Performance Tools to profile jobs running under the Condor batch-scheduling environment.