Analyzing Ultra-Scale Application Communication Requirements for a Reconfigurable Hybrid Interconnect

Authors:
John Shalf; Shoaib Kamil; Leonid Oliker; David Skinner
Affiliations:
Lawrence Berkeley National Laboratory; Lawrence Berkeley National Laboratory; Lawrence Berkeley National Laboratory; Lawrence Berkeley National Laboratory
Venue:
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Year:
2005

Abstract

The path towards realizing peta-scale computing is increasingly dependent on scaling up to unprecedented numbers of processors. To prevent the interconnect architecture between processors from dominating the overall cost of such systems, there is a critical need for interconnect solutions that both provide performance to ultra-scale applications and have costs that scale linearly with system size. In this work we propose the Hybrid Flexibly Assignable Switch Topology (HFAST) infrastructure. The HFAST approach uses both passive (circuit switch) and active (packet switch) commodity switch components to deliver all of the flexibility and fault-tolerance of a fully-interconnected network (such as a fat-tree), while preserving the nearly linear cost scaling associated with traditional low-degree interconnect networks. To understand the applicability of this technology, we perform an in-depth study of communication requirements across a broad spectrum of important scientific applications, whose computational methods include finite-difference, lattice-Boltzmann, particle-in-cell, sparse linear algebra, particle-mesh Ewald, and FFT-based solvers. We use the IPM (Integrated Performance Monitoring) profiling layer to gather detailed messaging statistics with minimal impact on code performance. This profiling provides us with sufficiently detailed communication topology and message volume data to evaluate these applications in the context of the proposed hybrid interconnect. Overall results show that HFAST is a promising approach for practically addressing the interconnect requirements of future peta-scale systems.
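
As a rough illustration of how communication topology and message volume data might be used to judge whether a low-degree interconnect could serve an application, the sketch below counts each rank's communicating partners. It assumes the profile has already been reduced to a rank-by-rank matrix of bytes sent. The function name `partner_counts`, the 4-rank matrix, and the 1024-byte threshold are hypothetical; none of them comes from the paper or from IPM's actual output format.

```python
from typing import List


def partner_counts(volume: List[List[int]], min_bytes: int = 0) -> List[int]:
    """Count, for each rank, the partners whose combined traffic
    (sent plus received) exceeds min_bytes.

    volume[i][j] is the number of bytes rank i sent to rank j
    (hypothetical input layout, assumed for this sketch).
    """
    n = len(volume)
    counts = []
    for i in range(n):
        partners = 0
        for j in range(n):
            if i == j:
                continue  # a rank is not its own partner
            if volume[i][j] + volume[j][i] > min_bytes:
                partners += 1
        counts.append(partners)
    return counts


# Hypothetical 4-rank profile: a ring of 4096-byte nearest-neighbour
# exchanges, plus one tiny 8-byte message from rank 0 to rank 2.
volume = [
    [0, 4096, 8, 4096],
    [4096, 0, 4096, 0],
    [0, 4096, 0, 4096],
    [4096, 0, 4096, 0],
]

all_partners = partner_counts(volume)                    # every message counts
large_partners = partner_counts(volume, min_bytes=1024)  # ignore small messages

print(all_partners)
print(large_partners)
```

In this made-up example, ignoring the small message lowers the partner count of ranks 0 and 2 from three to two. A low per-rank count after thresholding suggests that only a few high-bandwidth links per node carry most of the traffic.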