Extensible, Scalable Monitoring for Clusters of Computers

  • Authors:
  • Eric Anderson; Dave Patterson

  • Affiliations:
  • U. C. Berkeley; U. C. Berkeley

  • Venue:
  • LISA '97: Proceedings of the 11th USENIX Conference on System Administration
  • Year:
  • 1997

Abstract

We describe the CARD (Cluster Administration using Relational Databases) system for monitoring large clusters of cooperating computers. CARD scales both in capacity and in visualization to at least 150 machines, and can in principle scale far beyond that. The architecture is easily extensible to monitor new cluster software and hardware. CARD detects and automatically recovers from common faults. CARD uses a Java applet as its primary interface, allowing users anywhere in the world to monitor the cluster through their browsers.