Modeling Billion-Node Torus Networks Using Massively Parallel Discrete-Event Simulation

  • Authors:
  • Ning Liu; Christopher D. Carothers

  • Venue:
  • PADS '11 Proceedings of the 2011 IEEE Workshop on Principles of Advanced and Distributed Simulation
  • Year:
  • 2011

Abstract

Exascale supercomputers will have millions or even hundreds of millions of processing cores and the potential for nearly billion-way parallelism. Exascale compute and data storage architectures will be critically dependent on the interconnection network. The most popular interconnection network for current and future supercomputer systems is the torus (e.g., the k-ary n-cube). This paper focuses on the modeling and simulation of ultra-large-scale torus networks using Rensselaer's Optimistic Simulation System (ROSS). We compare the communication delays predicted by our model against those measured on the actual torus network of a Blue Gene/L using 2,048 processors. Our performance experiments demonstrate the ability to simulate million-node to billion-node torus networks. The torus network model for a 16-million-node configuration shows a high degree of strong scaling when going from 1,024 cores to 32,768 cores on Blue Gene/L, with a peak event rate of nearly 5 billion events per second. Finally, we demonstrate the performance of our torus network model configured with 1 billion nodes using up to 16,384 Blue Gene/L processors.
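
To make the k-ary n-cube topology mentioned in the abstract concrete, the following is a minimal Python sketch of torus addressing: mapping a node's linear rank to n coordinates, listing its 2n wraparound neighbors, and counting the minimal hops between two nodes. It is an illustration only and is not taken from the paper or from ROSS. The function names (`rank_to_coords`, `coords_to_rank`, `torus_neighbors`, `hop_count`) and the row-major rank layout are assumptions made for this example.

```python
from typing import List, Tuple


def rank_to_coords(rank: int, k: int, n: int) -> Tuple[int, ...]:
    """Convert a linear node rank into n coordinates, each in [0, k).

    Assumed layout: dimension 0 varies fastest (hypothetical convention).
    """
    coords = []
    for _ in range(n):
        coords.append(rank % k)
        rank //= k
    return tuple(coords)


def coords_to_rank(coords, k: int) -> int:
    """Inverse of rank_to_coords for the same assumed layout."""
    rank = 0
    for c in reversed(coords):
        rank = rank * k + c
    return rank


def torus_neighbors(rank: int, k: int, n: int) -> List[int]:
    """Return the 2n neighbors of a node in a k-ary n-cube torus.

    Each dimension contributes a -1 and a +1 neighbor, with wraparound
    modulo k. The neighbors are distinct only when k >= 3.
    """
    coords = list(rank_to_coords(rank, k, n))
    neighbors = []
    for dim in range(n):
        for step in (-1, 1):
            shifted = coords.copy()
            shifted[dim] = (coords[dim] + step) % k
            neighbors.append(coords_to_rank(shifted, k))
    return neighbors


def hop_count(src: int, dst: int, k: int, n: int) -> int:
    """Minimal hop count between two nodes, using the shorter way
    around the ring in every dimension."""
    a = rank_to_coords(src, k, n)
    b = rank_to_coords(dst, k, n)
    return sum(min((x - y) % k, (y - x) % k) for x, y in zip(a, b))


if __name__ == "__main__":
    k, n = 4, 3  # a small 4-ary 3-cube with 4**3 = 64 nodes
    print(torus_neighbors(0, k, n))
    print(hop_count(0, k**n - 1, k, n))
```

A k-ary n-cube has k**n nodes, so a three-dimensional torus with k = 1,024 would hold about 1.07 billion nodes. That figure is simple arithmetic on the topology and is not a statement of the configuration used in the paper.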