As we enter the era of peta-scale computing, system architects must plan for machines composed of tens or even hundreds of thousands of processors. Although fully connected networks such as fat-tree configurations currently dominate HPC interconnect designs, such approaches are inadequate for ultra-scale concurrencies due to the superlinear growth of component costs. Traditional low-degree interconnect topologies, such as 3D tori, have reemerged as a competitive solution due to the linear scaling of system components relative to the node count; however, such networks are poorly suited for the requirements of many scientific applications at extreme concurrencies. To address these limitations, we propose HFAST, a hybrid switch architecture that uses circuit switches to dynamically reconfigure lower-degree interconnects to suit the topological requirements of a given scientific application. This work presents several new research contributions. We develop an optimization strategy for HFAST mappings and demonstrate that efficiency gains can be attained across a broad range of static numerical computations. Additionally, we conduct an extensive analysis of the communication characteristics of a dynamically adapting mesh calculation and show that the HFAST approach can achieve significant advantages, even when compared with traditional fat-tree configurations. Overall results point to the promising potential of utilizing hybrid reconfigurable networks to interconnect future peta-scale architectures, for both static and dynamically adapting applications.