Contention-aware node allocation policy for high-performance capacity systems

  • Authors:
  • Ana Jokanovic; Cyriel Minkenberg; Jose Carlos Sancho; Ramon Beivide; German Rodriguez; Jesus Labarta

  • Affiliations:
  • Barcelona Supercomputing Center, Barcelona, Spain; IBM Research -- Zurich, Rüschlikon, Switzerland; Barcelona Supercomputing Center, Barcelona, Spain; University of Cantabria, Santander, Spain; IBM Research -- Zurich, Rüschlikon, Switzerland; Barcelona Supercomputing Center, Barcelona, Spain

  • Venue:
  • Proceedings of the 2012 Interconnection Network Architecture: On-Chip, Multi-Chip Workshop
  • Year:
  • 2012

Abstract

Inter-application network contention is seen as a major hurdle to achieving higher throughput in today's large-scale high-performance capacity systems. This effect is aggravated by current system schedulers that allocate jobs as soon as nodes become available, thus producing job fragmentation, i.e., the tasks of one job may be spread throughout the system instead of being allocated contiguously. This fragmentation increases the probability of sharing network resources with other applications, which in turn increases inter-application network contention. In this paper, we propose the use of a contention-aware node allocation technique. The technique identifies the applications most likely to have a large impact on inter-application contention and obtains a more contiguous allocation for these particular workloads. We demonstrate that, although a contiguous node allocation on slimmed fat-tree topologies may increase intra-application contention, the reduction in inter-application contention is more significant. Simulation experiments on a 2,048-node system running multiple applications showed that this technique reduces contention time by up to 35%.
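
The contrast between first-available and contiguous allocation can be illustrated with a minimal Python sketch. It is not the policy evaluated in the paper, whose details the abstract does not give. It assumes that nodes are numbered so that `node // nodes_per_switch` identifies the leaf switch, that a job is flagged as contention-prone by some external classifier (the `contention_prone` argument), and that a greedy "fullest switch first" rule is an acceptable stand-in for contiguity. The names `allocate` and `switches_used` are hypothetical.

```python
from collections import defaultdict


def allocate(free_nodes, job_size, contention_prone, nodes_per_switch=16):
    """Pick nodes for a job; return None if too few nodes are free.

    Hypothetical helper: jobs flagged contention-prone are packed onto as
    few leaf switches as possible, other jobs take the first free nodes.
    """
    if job_size > len(free_nodes):
        return None

    if not contention_prone:
        # Baseline: lowest-numbered free nodes, regardless of location.
        return sorted(free_nodes)[:job_size]

    # Group free nodes by the leaf switch they are assumed to attach to.
    by_switch = defaultdict(list)
    for node in sorted(free_nodes):
        by_switch[node // nodes_per_switch].append(node)

    # Greedy heuristic: fill from the switches with the most free nodes,
    # breaking ties by switch index.
    chosen = []
    for switch in sorted(by_switch, key=lambda s: (-len(by_switch[s]), s)):
        chosen.extend(by_switch[switch][: job_size - len(chosen)])
        if len(chosen) == job_size:
            break
    return chosen


def switches_used(nodes, nodes_per_switch=16):
    """Count the distinct leaf switches an allocation touches."""
    return len({n // nodes_per_switch for n in nodes})


if __name__ == "__main__":
    free = [0, 1, 4, 8, 9, 10, 11, 12]  # fragmented free list, 4 nodes/switch
    for prone in (False, True):
        nodes = allocate(free, 4, prone, nodes_per_switch=4)
        print(prone, nodes, switches_used(nodes, nodes_per_switch=4))
```

In this toy free list the baseline spreads the job over three leaf switches, while the contention-aware path keeps it on one, which is the kind of reduction in shared links that the abstract attributes to contiguous allocation.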