Feasibility study and early experimental results towards cluster survivability

  • Authors:
  • C. Leangsuksun;A. Tikotekar;M. Pourzandi;I. Haddad

  • Affiliations:
  • Louisiana Tech Univ., Ruston, LA, USA;Louisiana Tech Univ., Ruston, LA, USA;Nat. e-Sci. Centre, Glasgow Univ., UK;Sch. of Telecommun. Eng., Valladolid Univ., Spain

  • Venue:
  • CCGRID '05 Proceedings of the Fifth IEEE International Symposium on Cluster Computing and the Grid - Volume 01
  • Year:
  • 2005

Abstract

This paper presents an investigation, a feasibility study, and performance benchmarking of vital management elements for critical enterprise and HPC infrastructure. We propose integrating a high-availability cluster mechanism with a secure cluster infrastructure. Our proposed architecture incorporates the Distributed Security Infrastructure (DSI) framework, an open source project that provides a secure infrastructure for carrier-grade clusters, and HA-OSCAR, an open source cluster framework that meets reliability, availability, and serviceability (RAS) needs. The result is a cluster infrastructure that complies with the reliability, availability, serviceability, and security (RASS) principles. We conducted an initial feasibility study and experiment to gauge the issues and the degree of success in implementing our proposed RASS framework, and verified the integration of HA-OSCAR release 1.0 with DSI release 0.3. Although the integration introduced a minimal performance overhead, the benefit of having RASS in mission-critical settings far outweighs the performance impact. We plan to extend our proof-of-concept architecture to meet the needs of production environments.