Monitoring in a distributed environment such as a grid is crucial for the normal operation of all subsystems. Continuous information gathering enables efficient security auditing, failure detection, maintenance, job scheduling, accounting, resource performance tuning, and debugging. In this paper we focus on monitoring grid resources for the purposes of failure detection, notification, and automatic recovery. We introduce a system, based on the open source monitoring framework Nagios, that provides these functions. We describe the grid-specific features we implemented to make grid monitoring efficient, namely sensors for various grid services, an advanced sensor hierarchy, and certificate-based authorization on the web interface. Finally, we give an overview of the implementation of our system for monitoring the EGEE grid infrastructure.
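To make the notion of a sensor concrete, the following is a minimal sketch of a check written to the general Nagios plugin convention: print one status line and report one of the exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). It is not the sensor code described in the paper. The TCP reachability probe, the function names `classify`, `check_tcp`, and `main`, and the threshold values are all assumptions chosen for illustration.

```python
import socket
import time

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(elapsed, warn_seconds, crit_seconds):
    """Map a response time in seconds to a status code (thresholds are illustrative)."""
    if elapsed >= crit_seconds:
        return CRITICAL
    if elapsed >= warn_seconds:
        return WARNING
    return OK


def check_tcp(host, port, warn_seconds=1.0, crit_seconds=5.0):
    """Probe a TCP port and return (status_code, one-line status message)."""
    start = time.monotonic()
    try:
        # The connection is closed immediately; only reachability and latency matter here.
        with socket.create_connection((host, port), timeout=crit_seconds):
            elapsed = time.monotonic() - start
    except OSError as exc:
        return CRITICAL, f"CRITICAL - {host}:{port} unreachable ({exc})"
    status = classify(elapsed, warn_seconds, crit_seconds)
    return status, f"{STATUS_NAMES[status]} - {host}:{port} answered in {elapsed:.3f}s"


def main(args):
    """Hypothetical entry point; args is [host, port]. Returns the exit code."""
    if len(args) != 2 or not args[1].isdigit():
        print("UNKNOWN - usage: <host> <port>")
        return UNKNOWN
    code, message = check_tcp(args[0], int(args[1]))
    print(message)
    return code
```

A wrapper script would call `sys.exit(main(sys.argv[1:]))` so that Nagios can read the exit code. A real grid sensor would replace the plain TCP probe with a service-specific test, which this sketch does not attempt.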