TOPOMON: A Monitoring Tool for Grid Network Topology
ICCS '02 Proceedings of the International Conference on Computational Science-Part II
Efficient Matching for Web-Based Publish/Subscribe Systems
CoopIS '02 Proceedings of the 7th International Conference on Cooperative Information Systems
The NetLogger Methodology for High Performance Distributed Systems Performance Analysis
HPDC '98 Proceedings of the 7th IEEE International Symposium on High Performance Distributed Computing
Dynamic Monitoring of High-Performance Distributed Applications
HPDC '02 Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing
Deep scientific computing requires deep data
IBM Journal of Research and Development
DiPerF: An Automated DIstributed PERformance Testing Framework
GRID '04 Proceedings of the 5th IEEE/ACM International Workshop on Grid Computing
Journal of Parallel and Distributed Computing - Special issue: Design and performance of networks for super-, cluster-, and grid-computing: Part II
Log summarization and anomaly detection for troubleshooting distributed systems
GRID '07 Proceedings of the 8th IEEE/ACM International Conference on Grid Computing
A Grid resource brokering strategy based on resource and network performance in Grid
Future Generation Computer Systems
Typical Grid computing scenarios involve many distributed hardware and software components. The more components that are involved, the more likely it is that one of them may fail. In order for Grid computing to succeed, there must be a simple mechanism to determine which component failed and why. Instrumentation of all Grid applications and middleware is an important part of the solution to this problem. However, it must be possible to control and adapt the amount of instrumentation data produced in order to not be flooded by this data. In this paper we describe a scalable, high-performance instrumentation activation mechanism that addresses this problem.