Flattened butterfly: a cost-efficient topology for high-radix networks

Authors:
John Kim; William J. Dally; Dennis Abts
Affiliations:
Stanford University, Stanford, CA; Stanford University, Stanford, CA; Cray Inc., Chippewa Falls, WI
Venue:
Proceedings of the 34th annual international symposium on Computer architecture
Year:
2007

Abstract

Increasing integrated-circuit pin bandwidth has motivated a corresponding increase in the degree or radix of interconnection networks and their routers. This paper introduces the flattened butterfly, a cost-efficient topology for high-radix networks. On benign (load-balanced) traffic, the flattened butterfly approaches the cost/performance of a butterfly network and has roughly half the cost of a comparable-performance Clos network. The advantage over the Clos is achieved by eliminating redundant hops when they are not needed for load balance. On adversarial traffic, the flattened butterfly matches the cost/performance of a folded-Clos network and provides an order of magnitude better performance than a conventional butterfly. In this case, global adaptive routing is used to switch the flattened butterfly from minimal to non-minimal routing, using redundant hops only when they are needed. Minimal and non-minimal, oblivious and adaptive routing algorithms are evaluated on the flattened butterfly. We show that load-balancing adversarial traffic requires non-minimal globally-adaptive routing and show that sequential allocators are required to avoid transient load imbalance when using adaptive routing algorithms. We also compare the cost of the flattened butterfly to folded-Clos, hypercube, and butterfly networks with identical capacity and show that the flattened butterfly is more cost-efficient than folded-Clos and hypercube topologies.
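
The contrast between minimal and non-minimal routing in the abstract can be illustrated with a small sketch. The abstract does not spell out the construction, so the code below assumes a common description of the topology: each router is labeled with a tuple of digits and is directly linked to every router that differs from it in exactly one digit. The function names (`routers`, `neighbors`, `minimal_hops`, `nonminimal_hops`) and the parameters `k` and `dims` are hypothetical and chosen only for illustration; the sketch counts hops and does not model the paper's adaptive routing or allocators.

```python
from itertools import product


def routers(k, dims):
    """All router labels: tuples of `dims` digits, each in range(k).

    Assumption: routers are addressed by digit tuples (not stated in the abstract).
    """
    return list(product(range(k), repeat=dims))


def neighbors(router, k):
    """Routers that differ from `router` in exactly one digit (assumed link rule)."""
    result = []
    for d, digit in enumerate(router):
        for v in range(k):
            if v != digit:
                result.append(router[:d] + (v,) + router[d + 1:])
    return result


def minimal_hops(src, dst):
    """Minimal inter-router hop count: one hop per differing digit."""
    return sum(a != b for a, b in zip(src, dst))


def nonminimal_hops(src, mid, dst):
    """Hop count of a non-minimal route that detours through router `mid`."""
    return minimal_hops(src, mid) + minimal_hops(mid, dst)


if __name__ == "__main__":
    k, dims = 4, 2
    print(len(routers(k, dims)))                     # number of routers
    print(len(neighbors((0, 0), k)))                 # inter-router links per router
    print(minimal_hops((0, 0), (3, 2)))              # direct route
    print(nonminimal_hops((0, 0), (1, 1), (3, 2)))   # detour via (1, 1)
```

Under these assumptions, a minimal route never takes more hops than there are digits in the label, while a detour through an intermediate router can take up to twice as many. Those extra hops are the "redundant hops" that, per the abstract, are worth paying for only when adversarial traffic needs load balancing.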