Computing Properties of Large Scalable and Fault-Tolerant Logical Networks

  • Authors:
  • Christophe Cerin;Michel Koskas;Yu Lei

  • Venue:
  • SBAC-PAD '11 Proceedings of the 2011 23rd International Symposium on Computer Architecture and High Performance Computing
  • Year:
  • 2011

Abstract

As the number of processors embedded in high-performance computing platforms keeps growing, developers must improve the scalability of their codes in order to exploit all the resources of these platforms. This often requires new algorithms, techniques, and methods for code development that add new properties to the application code: the presence of faults is no longer an occasional event but a challenge. Scalability and fault-tolerance issues also arise in the hidden parts of any platform: the overlay network that must be built to control the application, and the runtime system's messaging support, which must likewise be scalable and fault tolerant. In this paper, we focus on the computational challenges of experimenting with large-scale (many millions of nodes) logical topologies. We compute fault-tolerance properties of different randomly generated variants of Binomial Graphs (BMG). For instance, we exhibit interesting properties relating the number of links to some desired fault-tolerance properties, and we compare different metrics using the Binomial Graph structure as the reference. A software tool has been developed for this study, and we show experimental results with topologies containing 21000 nodes. We also explain the computational challenges that arise with such large-scale topologies, and we introduce various probabilistic algorithms for computing the conventional metrics.
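
To make the kind of computation discussed in the abstract concrete, here is a minimal Python sketch. It is not the authors' tool or any of their algorithms, which the abstract does not describe. It assumes the commonly cited definition of a binomial graph, in which node i of an n-node graph is linked to (i ± 2^k) mod n for every power of two below n. The function names (`bmg_neighbors`, `eccentricity`, `sampled_diameter`) and the `failed` parameter are hypothetical. The sampling step illustrates one simple probabilistic approach: running a breadth-first search from a few random sources gives a lower bound on the diameter without visiting every pair of nodes.

```python
import random
from collections import deque


def bmg_neighbors(i, n):
    """Neighbors of node i, assuming links to (i +/- 2^k) mod n for 2^k < n."""
    nbrs = set()
    step = 1
    while step < n:
        nbrs.add((i + step) % n)
        nbrs.add((i - step) % n)
        step *= 2
    nbrs.discard(i)  # guard against self-loops for very small n
    return nbrs


def eccentricity(src, n, failed=frozenset()):
    """BFS from src, skipping failed nodes.

    Returns (largest distance to a reachable node, number of reachable nodes).
    """
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in bmg_neighbors(u, n):
            if v not in dist and v not in failed:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values()), len(dist)


def sampled_diameter(n, samples, failed=frozenset(), seed=0):
    """Lower bound on the diameter from BFS runs at randomly chosen live nodes."""
    rng = random.Random(seed)
    alive = [v for v in range(n) if v not in failed]
    sources = rng.sample(alive, min(samples, len(alive)))
    return max(eccentricity(s, n, failed)[0] for s in sources)
```

Under the assumed definition the fault-free graph is a circulant graph, so every node has the same eccentricity and a single search already gives the exact diameter. Sampling only becomes an approximation once nodes are placed in `failed`, for example `sampled_diameter(1024, 16, failed=frozenset({3, 7}))`.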