No cache-coherence: a single-cycle ring interconnection for multi-core L1-NUCA sharing on 3D chips

  • Authors:
  • Shu-Hsuan Chou;Chien-Chih Chen;Chi-Neng Wen;Yi-Chao Chan;Tien-Fu Chen;Chao-Ching Wang;Jinn-Shyan Wang

  • Affiliations:
  • National Chung Cheng University, Taiwan, R.O.C.;National Chung Cheng University, Taiwan, R.O.C.;National Chung Cheng University, Taiwan, R.O.C.;National Chung Cheng University, Taiwan, R.O.C.;National Chung Cheng University, Taiwan, R.O.C.;National Chung Cheng University, Taiwan, R.O.C.;National Chung Cheng University, Taiwan, R.O.C.

  • Venue:
  • Proceedings of the 46th Annual Design Automation Conference
  • Year:
  • 2009

Abstract

In line with the trend toward many-core SoC designs and 3D chip integration, this paper proposes a "single-cycle ring" interconnection (SC_Ring) with ultra-low latency and minimal complexity. The SC_Ring carries multiple single-cycle transactions in parallel. The circuit-switched design consists of a set of 4 to 16 three-ported routers and an arbiter that is efficient in both performance and timing. The arbiter, called "BTPC", performs arbitration and routing control in a single cycle by means of novel Binary-Tree path convergence and path-prediction mechanisms, which greatly reduce its time complexity. Combined with 3D chip integration, the proposed ring-based interconnection reduces cost, latency, and power for hierarchical clustering in future many-core systems. As a case study, this work builds on the SC_Ring to realize a "level-1 non-uniform cache architecture" (L1-NUCA) that provides fast data communication among threads and cores without cache coherence. Experimental results show that the approach yields promising performance.
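
The idea of several transactions sharing a circuit-switched ring in the same cycle can be illustrated with a small sketch. It is not the BTPC arbiter described in the abstract: it assumes a unidirectional (clockwise) ring and a simple fixed-priority, greedy grant policy, and the names `path_links` and `arbitrate` are hypothetical. It only shows that requests whose paths use disjoint ring links could, in principle, be granted together.

```python
from typing import List, Set, Tuple


def path_links(src: int, dst: int, n: int) -> List[int]:
    """Links used when travelling clockwise from src to dst.

    Assumption: link i joins node i to node (i + 1) % n.
    """
    links = []
    node = src
    while node != dst:
        links.append(node)
        node = (node + 1) % n
    return links


def arbitrate(requests: List[Tuple[int, int]], n: int) -> List[Tuple[int, int]]:
    """Grant requests in list order if their links are all still free.

    This greedy policy is an illustrative stand-in, not the paper's
    binary-tree path convergence or path prediction.
    """
    busy: Set[int] = set()
    granted = []
    for src, dst in requests:
        links = path_links(src, dst, n)
        # A request with src == dst needs no ring link and is skipped.
        if links and busy.isdisjoint(links):
            busy.update(links)
            granted.append((src, dst))
    return granted


if __name__ == "__main__":
    # Eight routers; the second and fourth requests overlap earlier grants.
    print(arbitrate([(0, 2), (1, 3), (4, 6), (6, 1)], n=8))
```

In this sketch the requests (0, 2) and (4, 6) are granted in the same cycle because they occupy different links, while (1, 3) and (6, 1) are deferred because each shares a link with an earlier grant.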