In designing the management system for Blue Gene/L, we adopted a database-centric approach. All configuration and operational data for a particular Blue Gene/L system are stored in a relational database that is kept on the system's service node. The database also serves as the communication bus for the various processes implementing the management system. This design offers many advantages, including the ability to use SQL commands to retrieve reliability, availability, and serviceability (RAS) information about the system. Information about machine partitioning and user jobs can be obtained in the same way. Leveraging the database, we have developed a web interface for system management. This management system has been successfully implemented and deployed in all 19 Blue Gene/L installations at the time of this writing.
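To make the database-centric idea concrete, the following is a minimal sketch, not the Blue Gene/L schema or code. It assumes a hypothetical pair of tables (`ras_events`, `jobs`), hypothetical location names, and uses Python's built-in `sqlite3` in place of the production relational database. It illustrates two points from the abstract: RAS information is retrieved with an ordinary SQL query, and processes coordinate by writing and reading rows rather than by messaging each other directly.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ras_events (
            id INTEGER PRIMARY KEY,
            severity TEXT,
            location TEXT,
            message TEXT
        );
        CREATE TABLE jobs (
            id INTEGER PRIMARY KEY,
            partition_name TEXT,
            status TEXT
        );
        """
    )
    # Sample RAS events; locations and messages are invented for illustration.
    conn.executemany(
        "INSERT INTO ras_events (severity, location, message) VALUES (?, ?, ?)",
        [
            ("INFO", "R00-M0", "node card powered on"),
            ("FATAL", "R00-M1", "uncorrectable memory error"),
            ("WARN", "R01-M0", "fan speed low"),
            ("FATAL", "R00-M1", "link training failed"),
        ],
    )
    conn.commit()
    return conn


def fatal_counts_by_location(conn):
    """RAS retrieval as a plain SQL query: count FATAL events per location."""
    return conn.execute(
        "SELECT location, COUNT(*) FROM ras_events "
        "WHERE severity = 'FATAL' GROUP BY location ORDER BY location"
    ).fetchall()


def submit_job(conn, partition_name):
    """Producer side of the 'bus': a submitting process inserts a QUEUED row."""
    cur = conn.execute(
        "INSERT INTO jobs (partition_name, status) VALUES (?, 'QUEUED')",
        (partition_name,),
    )
    conn.commit()
    return cur.lastrowid


def claim_next_job(conn):
    """Consumer side: a launcher process picks the oldest QUEUED row and marks it RUNNING."""
    row = conn.execute(
        "SELECT id FROM jobs WHERE status = 'QUEUED' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE jobs SET status = 'RUNNING' WHERE id = ?", (row[0],))
    conn.commit()
    return row[0]
```

In this sketch, the submitting and launching processes never communicate directly; the `jobs` table is their only shared state. A real multi-process deployment would additionally need transactional claiming so that two launchers cannot take the same row.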