Fat-trees: universal networks for hardware-efficient supercomputing
IEEE Transactions on Computers
Performance Analysis of k-ary n-cube Interconnection Networks
IEEE Transactions on Computers
The SGI Origin: a ccNUMA highly scalable server
Proceedings of the 24th annual international symposium on Computer architecture
Journal of the ACM (JACM)
IEEE Transactions on Parallel and Distributed Systems
Dynamic Voltage Scaling with Links for Power Optimization of Interconnection Networks
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Power constrained design of multiprocessor interconnection networks
ICCD '97 Proceedings of the 1997 International Conference on Computer Design (ICCD '97)
GOAL: a load-balanced adaptive routing algorithm for torus networks
Proceedings of the 30th annual international symposium on Computer architecture
Energy optimization techniques in cluster interconnects
Proceedings of the 2003 international symposium on Low power electronics and design
Power-driven Design of Router Microarchitectures in On-chip Networks
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Principles and Practices of Interconnection Networks
Principles and Practices of Interconnection Networks
Design-Space Exploration of Power-Aware On/Off Interconnection Networks
ICCD '04 Proceedings of the IEEE International Conference on Computer Design
Microarchitecture of a High-Radix Router
Proceedings of the 32nd annual international symposium on Computer Architecture
The BlackWidow High-Radix Clos Network
Proceedings of the 33rd annual international symposium on Computer Architecture
Adaptive routing in high-radix clos network
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Technology-Driven, Highly-Scalable Dragonfly Topology
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
ARC '09 Proceedings of the 5th International Workshop on Reconfigurable Computing: Architectures, Tools and Applications
Indirect adaptive routing on large scale interconnection networks
Proceedings of the 36th annual international symposium on Computer architecture
HyperX: topology, routing, and packaging of efficient large-scale networks
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Outstanding research problems in NoC design: system, microarchitecture, and circuit perspectives
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
On-chip bidirectional wiring for heavily pipelined systems using network coding
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
Energy proportional datacenter networks
Proceedings of the 37th annual international symposium on Computer architecture
Evaluating Bufferless Flow Control for On-chip Networks
NOCS '10 Proceedings of the 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip
Power-Efficient and High-Performance Multi-level Hybrid Nanophotonic Interconnect for Multicores
NOCS '10 Proceedings of the 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip
Fiber optic communication technologies: what's needed for datacenter network operations
IEEE Communications Magazine
Axon: a flexible substrate for source-routed ethernet
Proceedings of the 6th ACM/IEEE Symposium on Architectures for Networking and Communications Systems
DOS: a scalable optical switch for datacenters
Proceedings of the 6th ACM/IEEE Symposium on Architectures for Networking and Communications Systems
Design of a scalable nanophotonic interconnect for future multicores
Proceedings of the 5th ACM/IEEE Symposium on Architectures for Networking and Communications Systems
On-Chip Network Evaluation Framework
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
LEGUP: using heterogeneity to reduce the cost of data center network upgrades
Proceedings of the 6th International COnference
A practical low-latency router architecture with wing channel for on-chip network
Microprocessors & Microsystems
Scalable and cost-effective interconnection of data-center servers using dual server ports
IEEE/ACM Transactions on Networking (TON)
The Journal of Supercomputing
A case for heterogeneous on-chip interconnects for CMPs
Proceedings of the 38th annual international symposium on Computer architecture
The role of optics in future high radix switch design
Proceedings of the 38th annual international symposium on Computer architecture
2-Dilated flattened butterfly: A nonblocking switching topology for high-radix networks
Computer Communications
A Scalability Study of Enterprise Network Architectures
Proceedings of the 2011 ACM/IEEE Seventh Symposium on Architectures for Networking and Communications Systems
Packet chaining: efficient single-cycle allocation for on-chip networks
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Applying traffic merging to datacenter networks
Proceedings of the 3rd International Conference on Future Energy Systems: Where Energy, Computing and Communication Meet
CloudRAMSort: fast and efficient large-scale distributed RAM sort on shared-nothing cluster
SIGMOD '12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
Exploiting communication and packaging locality for cost-effective large scale networks
Proceedings of the 26th ACM international conference on Supercomputing
A case for random shortcut topologies for HPC interconnects
Proceedings of the 39th Annual International Symposium on Computer Architecture
Proceedings of the 39th Annual International Symposium on Computer Architecture
PAST: scalable ethernet for data centers
Proceedings of the 8th international conference on Emerging networking experiments and technologies
International Journal of Distributed Systems and Technologies
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Research on half-mesh topology based on binary model and HTF-XY routing algorithm
International Journal of Computer Applications in Technology
Scalable high-radix router microarchitecture using a network switch organization
ACM Transactions on Architecture and Code Optimization (TACO)
Memory-centric system interconnect design with hybrid memory cubes
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
BBQ: a straightforward queuing scheme to reduce hol-blocking in high-performance hybrid networks
Euro-Par'13 Proceedings of the 19th international conference on Parallel Processing
Design space exploration of on-chip ring interconnection for a CPU-GPU heterogeneous architecture
Journal of Parallel and Distributed Computing
Dahu: commodity switches for direct connect data center networks
ANCS '13 Proceedings of the ninth ACM/IEEE symposium on Architectures for networking and communications systems
Optimal networks from error correcting codes
ANCS '13 Proceedings of the ninth ACM/IEEE symposium on Architectures for networking and communications systems
Locality-oblivious cache organization leveraging single-cycle multi-hop NoCs
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
On the topological properties of HyperX
The Journal of Supercomputing
Hi-index | 0.00 |
Increasing integrated-circuit pin bandwidth has motivateda corresponding increase in the degree or radix of interconnection networksand their routers. This paper introduces the flattened butterfly, a cost-efficient topology for high-radix networks. On benign (load-balanced) traffic, the flattened butterfly approaches the cost/performance of a butterfly network and has roughly half the cost of a comparable performance Clos network.The advantage over the Clos is achieved by eliminating redundant hopswhen they are not needed for load balance. On adversarial traffic, the flattened butterfly matches the cost/performance of a folded-Clos network and provides an order of magnitude better performance than a conventional butterfly.In this case, global adaptive routing is used to switchthe flattened butterfly from minimal to non-minimal routing - usingredundant hops only when they are needed. Minimal and non-minimal, oblivious and adaptive routing algorithms are evaluated on the flattened butterfly.We show that load-balancing adversarial traffic requires non-minimalglobally-adaptive routing and show that sequential allocators are required to avoid transient load imbalance when using adaptive routing algorithms.We also compare the cost of the flattened butterfly to folded-Clos, hypercube,and butterfly networks with identical capacityand show that the flattened butterfly is more cost-efficient thanfolded-Clos and hypercube topologies.