In this paper, we describe a scalable interconnection network architecture for very large multicore processors implemented on stacked-chip 3D integrated circuits (3D-ICs). These networks provide fully interconnected, low-latency, single-hop performance with wiring complexity that scales linearly with network size. The enabling technology is a novel, fully distributed arbitration and control algorithm that operates solely at the edges of the network, with no routers in the network core. This paper focuses on describing that algorithm. We present simulation results for average, worst-case, and per-node latencies showing that the arbitration algorithm performs efficiently, scales across a wide range of partition sizes, and effectively manages highly non-uniform traffic patterns.