A database-centric approach to system management in the Blue Gene/L supercomputer

  • Authors:
  • Ralf Bellofatto;Paul G. Crumley;David Darrington;Brant Knudson;Mark Megerian;José E. Moreira;Alda S. Ohmacht;John Orbeck;Don Reed;Greg Stewart

  • Affiliations:
  • IBM Thomas J. Watson Research Center, Yorktown Heights, NY;IBM Thomas J. Watson Research Center, Yorktown Heights, NY;IBM Systems and Technology Group, Rochester, MN;IBM Systems and Technology Group, Rochester, MN;IBM Systems and Technology Group, Rochester, MN;IBM Thomas J. Watson Research Center, Yorktown Heights, NY;IBM Thomas J. Watson Research Center, Yorktown Heights, NY;IBM Systems and Technology Group, Rochester, MN;IBM Systems and Technology Group, Rochester, MN;IBM Systems and Technology Group, Rochester, MN

  • Venue:
  • IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
  • Year:
  • 2006

Abstract

In designing the management system for Blue Gene/L, we adopted a database-centric approach. All configuration and operational data for a particular Blue Gene/L system are stored in a relational database that is kept on the system's service node. The database also serves as the communication bus for the various processes implementing the management system. This design offers many advantages, including the ability to use SQL commands to retrieve reliability, availability, and serviceability (RAS) information about the system. Information about machine partitioning and user jobs can be obtained the same way. Leveraging the database, we have developed a web interface for system management. This management system has been successfully implemented and deployed in all 19 Blue Gene/L installations at the time of this writing.
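
To make the idea of querying management data with SQL concrete, the following is a minimal sketch in Python. It is not the Blue Gene/L schema: the tables `ras_events` and `jobs`, their columns, the sample rows, and the use of an in-memory SQLite database are all assumptions chosen for illustration only. The abstract does not specify the database product or the table layout.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical RAS and job tables."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: names and columns are illustrative assumptions,
    # not the actual Blue Gene/L tables.
    conn.executescript(
        """
        CREATE TABLE ras_events (
            id INTEGER PRIMARY KEY,
            location TEXT NOT NULL,
            severity TEXT NOT NULL,
            message TEXT NOT NULL
        );
        CREATE TABLE jobs (
            job_id INTEGER PRIMARY KEY,
            username TEXT NOT NULL,
            partition_id TEXT NOT NULL,
            status TEXT NOT NULL
        );
        """
    )
    # Made-up sample data, present only so the queries return something.
    conn.executemany(
        "INSERT INTO ras_events (location, severity, message) VALUES (?, ?, ?)",
        [
            ("R00-M0", "FATAL", "memory error"),
            ("R00-M0", "FATAL", "link failure"),
            ("R00-M0", "INFO", "node booted"),
            ("R01-M1", "FATAL", "power fault"),
        ],
    )
    conn.executemany(
        "INSERT INTO jobs (job_id, username, partition_id, status) VALUES (?, ?, ?, ?)",
        [
            (1, "alice", "P0", "RUNNING"),
            (2, "bob", "P0", "TERMINATED"),
            (3, "carol", "P1", "RUNNING"),
        ],
    )
    conn.commit()
    return conn


def fatal_events_by_location(conn):
    """Count FATAL RAS events per hardware location."""
    return conn.execute(
        "SELECT location, COUNT(*) FROM ras_events "
        "WHERE severity = 'FATAL' GROUP BY location ORDER BY location"
    ).fetchall()


def running_jobs(conn, partition_id):
    """List (job_id, username) for jobs currently running on a partition."""
    return conn.execute(
        "SELECT job_id, username FROM jobs "
        "WHERE partition_id = ? AND status = 'RUNNING' ORDER BY job_id",
        (partition_id,),
    ).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(fatal_events_by_location(db))
    print(running_jobs(db, "P0"))
```

Because RAS events and job records sit in the same relational store in this sketch, both kinds of question reduce to ordinary `SELECT` statements, which is the property the abstract highlights.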