Hiding Communication Delays in Clustered Microarchitectures

  • Authors:
  • Robert J. LaDuca;Joseph Sharkey;Dmitry V. Ponomarev

  • Venue:
  • SBAC-PAD '08 Proceedings of the 2008 20th International Symposium on Computer Architecture and High Performance Computing
  • Year:
  • 2008

Abstract

Clustered microarchitectures are a viable way to address wire delays in communication-bound architectures: they partition monolithic datapath structures into smaller components. Although clustering supports high clock frequencies, clustered processors usually lose instruction throughput because of inter-cluster communication delays and unbalanced workload distribution. In this paper, we propose and evaluate novel instruction steering policies that reduce or eliminate cross-cluster communication delays while respecting workload balance. Our first technique hides inter-cluster communication latencies by examining operand readiness information. The proposed policy steers each instruction with two register sources to the cluster predicted to produce the later-arriving operand. While that later operand is still being generated, the earlier-produced operand can be transported in parallel, hiding the communication delay. Our second technique steers the entire group of instructions co-renamed in a cycle to the same cluster if the number of intra-group register dependencies exceeds a threshold. This group steering is done in a round-robin fashion to reduce the impact on workload balance.
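
The two steering policies described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `Producer`, `Instr`, `steer_by_last_operand`, `intra_group_deps`, and `GroupSteerer` are hypothetical, as are the `ready_cycle` prediction field, the fallback cluster used when no producer information is available, and the choice to return `None` when a group does not meet the threshold.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Producer:
    cluster: int      # cluster the producing instruction was steered to
    ready_cycle: int  # predicted cycle at which its result is available (assumed field)


@dataclass
class Instr:
    dest: Optional[str]     # destination register, if any
    srcs: Tuple[str, ...]   # source registers


def steer_by_last_operand(instr: Instr,
                          producers: Dict[str, Producer],
                          fallback_cluster: int) -> int:
    """First policy: choose the cluster predicted to produce the
    latest-arriving source operand, so earlier operands can be
    forwarded across clusters while the last one is still in flight."""
    known = [producers[s] for s in instr.srcs if s in producers]
    if not known:
        # Assumption: with no in-flight producers, use a caller-supplied default.
        return fallback_cluster
    return max(known, key=lambda p: p.ready_cycle).cluster


def intra_group_deps(group: List[Instr]) -> int:
    """Count source operands written by an earlier instruction
    in the same rename group."""
    written = set()
    count = 0
    for ins in group:
        count += sum(1 for s in ins.srcs if s in written)
        if ins.dest is not None:
            written.add(ins.dest)
    return count


class GroupSteerer:
    """Second policy: send a whole rename group to one cluster when its
    internal dependency count exceeds a threshold, rotating the target
    cluster round-robin to spread the load."""

    def __init__(self, num_clusters: int, threshold: int) -> None:
        self.num_clusters = num_clusters
        self.threshold = threshold
        self.next_cluster = 0

    def steer(self, group: List[Instr]) -> Optional[int]:
        if intra_group_deps(group) > self.threshold:
            cluster = self.next_cluster
            self.next_cluster = (cluster + 1) % self.num_clusters
            return cluster
        # Assumption: None means "defer to a per-instruction policy".
        return None
```

In this sketch, an instruction whose sources are predicted ready at cycles 10 (cluster 0) and 14 (cluster 1) would be steered to cluster 1, since the cycle-10 value can cross clusters while the cycle-14 value is still being computed. How readiness is actually predicted, and how the threshold is chosen, is not stated in the abstract.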