Express Cubes: Improving the Performance of k-ary n-cube Interconnection Networks

Authors:
William J. Dally
Affiliations:
-
Venue:
IEEE Transactions on Computers
Year:
1991

Citing 9
Cited 43

The cosmic cube

Communications of the ACM - Special section on computer architecture
Fat-trees: universal networks for hardware-efficient supercomputing

IEEE Transactions on Computers
Multicomputers: Message-Passing Concurrent Computers

Computer
The architecture and programming of the Ametek series 2010 multicomputer

C3P Proceedings of the third conference on Hypercube concurrent computers and applications: Architecture, software, computer systems, and general issues - Volume 1
Performance Analysis of k-ary n-cube Interconnection Networks

IEEE Transactions on Computers
System design of the J-Machine

Proceedings of the sixth MIT conference on Advanced research in VLSI
Network and processor architecture for message-driven computers

VLSI and parallel computation
A VLSI Architecture for Concurrent Data Structures

A VLSI Architecture for Concurrent Data Structures
A Framework for Adaptive Routing in Multicomputer Networks

A Framework for Adaptive Routing in Multicomputer Networks

Methods for message routing in parallel machines

STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
A comparison of adaptive wormhole routing algorithms

ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
The turn model for adaptive routing

Journal of the ACM (JACM)
Reducing PE/Memory Traffic in Multiprocessors by the Difference Coding of Memory Addresses

IEEE Transactions on Parallel and Distributed Systems
NIFDY: a low overhead, high throughput network interface

ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Augmented Ring Networks

IEEE Transactions on Parallel and Distributed Systems
A design space evaluation of grid processor architectures

Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Boosting the Performance of Myrinet Networks

IEEE Transactions on Parallel and Distributed Systems
Optimal Architectures and Algorithms for Mesh-Connected Parallel Computers with Separable Row/Column Buses

IEEE Transactions on Parallel and Distributed Systems
Deadlock- and Livelock-Free Routing Protocols for Wave Switching

IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Bidirectional versus Unidirectional Networks: Cost/Performance Trade-Offs

MASCOTS '95 Proceedings of the 3rd International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems
Analysis of Buffer Design for Adaptive Routing in Direct Networks

MASCOTS '96 Proceedings of the 4th International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems
Improving the Performance of Regular Networks with Source Routing

ICPP '00 Proceedings of the 2000 International Conference on Parallel Processing
Power-driven Design of Router Microarchitectures in On-chip Networks

Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
A Technology-Aware and Energy-Oriented Topology Exploration for On-Chip Networks

Proceedings of the conference on Design, Automation and Test in Europe - Volume 2
Comparative Modeling of Network Topologies and Routing Strategies in Multicomputers

International Journal of High Performance Computing Applications
On-Chip Communication Architectures: System on Chip Interconnect

On-Chip Communication Architectures: System on Chip Interconnect
Express virtual channels: towards the ideal interconnection fabric

Proceedings of the 34th annual international symposium on Computer architecture
A Hybrid Ring/Mesh Interconnect for Network-on-Chip Using Hierarchical Rings for Global Routing

NOCS '07 Proceedings of the First International Symposium on Networks-on-Chip
MIRA: A Multi-layered On-Chip Interconnect Router Architecture

ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Exploring High-Dimensional Topologies for NoC Design Through an Integrated Analysis and Synthesis Framework

NOCS '08 Proceedings of the Second ACM/IEEE International Symposium on Networks-on-Chip
MC-Sim: an efficient simulation tool for MPSoC designs

Proceedings of the 2008 IEEE/ACM International Conference on Computer-Aided Design
Power reduction of CMP communication networks via RF-interconnects

Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
A k-cube graph construction for mappings from binary vectors to permutations

ISIT'09 Proceedings of the 2009 IEEE international conference on Symposium on Information Theory - Volume 1
An analysis of on-chip interconnection networks for large-scale chip multiprocessors

ACM Transactions on Architecture and Code Optimization (TACO)
Asynchronous Bypass Channels: Improving Performance for Multi-synchronous NoCs

NOCS '10 Proceedings of the 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip
Physical vs. Virtual Express Topologies with Low-Swing Links for Future Many-Core NoCs

NOCS '10 Proceedings of the 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip
Area and power-efficient innovative congestion-aware Network-on-Chip architecture

Journal of Systems Architecture: the EUROMICRO Journal
Modeling and evaluation of ring-based interconnects for Network-on-Chip

Journal of Systems Architecture: the EUROMICRO Journal
A power-efficient network on-chip topology

Proceedings of the Fifth International Workshop on Interconnection Network Architecture: On-Chip, Multi-Chip
"It's a small world after all": noc performance optimization via long-range link insertion

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
DART: a programmable architecture for NoC simulation on FPGAs

NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
HPC-Mesh: A Homogeneous Parallel Concentrated Mesh for Fault-Tolerance and Energy Savings

Proceedings of the 2011 ACM/IEEE Seventh Symposium on Architectures for Networking and Communications Systems
Design and evaluation of low latency interconnection networks for real-time many-core embedded systems

Computers and Electrical Engineering
A simple and efficient input selection function for networks-on-chip

ICDCN'12 Proceedings of the 13th international conference on Distributed Computing and Networking
Exploiting communication and packaging locality for cost-effective large scale networks

Proceedings of the 26th ACM international conference on Supercomputing
High-throughput differentiated service provision router architecture for wireless network-on-chip

International Journal of High Performance Systems Architecture
Design and evaluation of Mesh-of-Tree based Network-on-Chip using virtual channel router

Microprocessors & Microsystems
A load-balanced congestion-aware wireless network-on-chip design for multi-core platforms

Microprocessors & Microsystems
40.4fJ/bit/mm low-swing on-chip signaling with self-resetting logic repeaters embedded within a mesh NoC in 45nm SOI CMOS

Proceedings of the Conference on Design, Automation and Test in Europe
Analytical performance modeling of shuffle-exchange inspired mesh-based Network-on-Chips

Performance Evaluation
An Analysis of Reducing Communication Delay in Network-on-Chip Interconnect Architecture

Wireless Personal Communications: An International Journal

Quantified Score

Hi-index	14.98

Abstract

The author discusses express cubes, k-ary n-cube interconnection networks augmented by express channels that provide a short path for nonlocal messages. An express cube combines the logarithmic diameter of a multistage network with the wire-efficiency and ability to exploit locality of a low-dimensional mesh network. The insertion of express channels reduces the network diameter and thus the distance component of network latency. Wire length is increased, allowing networks to operate with latencies that approach the physical speed-of-light limitation rather than being limited by node delays. Express channels increase wire bisection in a manner that allows the bisection to be controlled independently of the choice of radix, dimension, and channel width. By increasing wire bisection to saturate the available wiring media, throughput can be substantially increased. With an express cube both latency and throughput are wire-limited and within a small factor of the physical limit on performance.
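
The abstract's claim that express channels reduce network diameter can be illustrated with a minimal sketch. It is not the paper's model: it assumes a one-dimensional array of k nodes without wraparound, attaches the express channels directly to every i-th node instead of to separate interchange units, and counts hops only, ignoring wire delay. The names build_array, hops, diameter, and express_interval are hypothetical and chosen for this illustration.

```python
from collections import deque


def build_array(k, express_interval=None):
    """Adjacency sets for a linear array of k nodes (no wraparound).

    Assumption for illustration: if express_interval is given, every
    express_interval-th node also gets a direct link to the node
    express_interval positions further along.
    """
    adj = {n: set() for n in range(k)}
    # Local channels between neighbouring nodes.
    for n in range(k - 1):
        adj[n].add(n + 1)
        adj[n + 1].add(n)
    # Express channels that skip over the intermediate nodes.
    if express_interval:
        for n in range(0, k - express_interval, express_interval):
            adj[n].add(n + express_interval)
            adj[n + express_interval].add(n)
    return adj


def hops(adj, src, dst):
    """Minimum hop count from src to dst, found by breadth-first search."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    raise ValueError("destination unreachable")


def diameter(adj):
    """Largest minimum hop count over all node pairs."""
    return max(hops(adj, s, d) for s in adj for d in adj)


if __name__ == "__main__":
    plain = build_array(16)
    express = build_array(16, express_interval=4)
    print("end-to-end hops, plain:  ", hops(plain, 0, 15))
    print("end-to-end hops, express:", hops(express, 0, 15))
    print("diameter, plain:  ", diameter(plain))
    print("diameter, express:", diameter(express))
```

With 16 nodes and an express interval of 4, the end-to-end path in this toy model drops from 15 hops to 6: three express hops to reach node 12, then three local hops. Each express hop spans a longer wire, which is the trade the abstract describes: fewer node delays in exchange for more wire delay.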