System administrators of large clusters often need to perform the same administrative task hundreds or thousands of times. Traditionally, administrators have performed time-consuming tasks such as operating system installation, configuration, and maintenance by hand. By combining network services such as DHCP, TFTP, FTP, HTTP, and NFS with remote hardware control and scripted installation, configuration, and maintenance techniques, cluster administrators can automate these tasks. Scalable cluster administration addresses this challenge: what hardware and software design techniques can cluster builders use to automate administration on very large clusters? We describe the approach used in the Mathematics and Computer Science Division of Argonne National Laboratory on Chiba City I, a 314-node Linux cluster, and we analyze the scalability, flexibility, performance, and reliability benefits and limitations of that approach.