A Quantitative Study of the On-Chip Network and Memory Hierarchy Design for Many-Core Processor

  • Authors:
  • Xu Wang;Ge Gan;Joseph Manzano;Dongrui Fan;Shuxu Guo

  • Affiliations:
  • -;-;-;-;-

  • Venue:
  • ICPADS '08 Proceedings of the 2008 14th IEEE International Conference on Parallel and Distributed Systems
  • Year:
  • 2008

Quantified Score

Hi-index 0.01

Visualization

Abstract

In this paper, we will study the on-chip network and memory hierarchy design of the Godson-T - a homogeneous many-core processor. Godson-T has 64 cores (with private L1 cache), and 16 global L2 cache banks. All these on-chip units are connected by a 2D $8\times8$ mesh network. Our study reveals that:(a) Global on-chip L2 cache can effectively alleviate the memory pressure caused by the data-thirsty on-chip computing engines. However, its potential is still limited by both the off-chip and the in-chip bandwidth, especially when increasing the number of active threads.(b) On-chip traffic congestion is largely caused by the intensive memory access requests issued from the on-chipcores. Therefore, the design of the on-chip network must consider the available performance of the datapath that connects the processor to the main memory. (c) In theory, different applications have different communication patterns (Berkeley's view [1]). However, the application's runtime communication pattern is only determined by the design of the underlying memory hierarchy and on-chip interconnection. These conclusions are generally applicable to a wide variety of many-core processors with similar design.