On-Demand Grid Application Tuning and Debugging with the NetLogger Activation Service

Authors:
Dan Gunter;Brian L. Tierney;Craig E. Tull;Vibha Virmani
Venue:
GRID '03 Proceedings of the 4th International Workshop on Grid Computing
Year:
2003

Abstract

Typical Grid computing scenarios involve many distributed hardware and software components. The more components that are involved, the more likely it is that one of them may fail. In order for Grid computing to succeed, there must be a simple mechanism to determine which component failed and why. Instrumentation of all Grid applications and middleware is an important part of the solution to this problem. However, it must be possible to control and adapt the amount of instrumentation data produced in order to not be flooded by this data. In this paper we describe a scalable, high-performance instrumentation activation mechanism that addresses this problem.