Optimizing 10-Gigabit Ethernet for Networks of Workstations, Clusters, and Grids: A Case Study
Proceedings of the 2003 ACM/IEEE conference on Supercomputing
Efficient and Safe Execution of User-Level Code in the Kernel
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 10 - Volume 11
Reliability challenges in large systems
Future Generation Computer Systems
Logging kernel events on clusters
Future Generation Computer Systems
End-system aware, rate-adaptive protocol for network transport in LambdaGrid environments
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Using performance reflection in systems software
HOTOS'03 Proceedings of the 9th conference on Hot Topics in Operating Systems - Volume 9
RAPID: an end-system aware protocol for intelligent data transfer over lambda grids
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
As computing systems grow in complexity, the cluster and grid communities require more sophisticated tools to diagnose, debug and analyze such systems. We have developed a toolkit called MAGNET (Monitoring Apparatus for General kerNel-Event Tracing) that provides a detailed look at operating-system kernel events with very low overhead. Using the fine-grained information that MAGNET exports from kernel space, challenging problems become amenable to identification and correction. In this paper, we first present the design, implementation and evaluation of MAGNET. Then, we show its use as a diagnostic tool, an online-monitoring tool and a tool for building adaptive applications in clusters and grids.