Instruction Replication for Clustered Microarchitectures

  • Authors:
  • Alex Aletà; Josep M. Codina; Antonio González; David Kaeli

  • Affiliations:
  • Dep. of Computer Architecture, UPC, Barcelona, Spain; Dep. of Computer Architecture, UPC, Barcelona, Spain; Dep. of Computer Architecture, UPC, Barcelona, Spain and Intel Barcelona Research Center, Intel Labs, UPC, Barcelona, Spain; Northeastern University, Boston, MA

  • Venue:
  • Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
  • Year:
  • 2003

Abstract

This work presents a new compilation technique that uses instruction replication in order to reduce the number of communications executed on a clustered microarchitecture. For such architectures, the need to communicate values between clusters can result in a significant performance loss. Inter-cluster communications can be reduced by selectively replicating an appropriate set of instructions. However, instruction replication must be done carefully since it may also degrade performance due to the increased contention it can place on processor resources. The proposed scheme is built on top of a previously proposed state-of-the-art modulo scheduling algorithm that effectively reduces communications. Results show that the number of communications can decrease using replication, which results in significant speed-ups. IPC is increased by 25% on average for a 4-cluster microarchitecture and by as much as 70% for selected programs.
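
The trade-off described in the abstract can be illustrated with a toy model. The sketch below is not the paper's algorithm and does not model modulo scheduling or resource contention; it only counts inter-cluster value transfers in a small dependence graph before and after replicating one instruction. All names (`count_communications`, `replicate`, the instruction labels) are hypothetical, and the cost model is an assumption: a value is sent once per (producer, destination cluster) pair.

```python
def count_communications(deps, placement):
    """Count inter-cluster value transfers in a toy model (an assumption,
    not the paper's cost model).

    deps:      dict mapping each instruction to the instructions it reads from.
    placement: dict mapping each instruction to the set of clusters that
               hold a copy of it.
    A value is transferred once per (producer, destination cluster) pair.
    """
    transfers = set()
    for consumer, producers in deps.items():
        for cluster in placement[consumer]:
            for producer in producers:
                # The consumer copy in this cluster needs the value, but no
                # copy of the producer runs here, so it must be communicated.
                if cluster not in placement[producer]:
                    transfers.add((producer, cluster))
    return len(transfers)


def replicate(placement, instr, cluster):
    """Return a new placement with an extra copy of `instr` in `cluster`."""
    new_placement = {name: set(clusters) for name, clusters in placement.items()}
    new_placement[instr].add(cluster)
    return new_placement


# Case 1: a producer with no inputs feeds consumers in three clusters.
fanout_deps = {"addr": [], "use0": ["addr"], "use1": ["addr"], "use2": ["addr"]}
fanout_placement = {"addr": {0}, "use0": {0}, "use1": {1}, "use2": {2}}
fanout_replicated = replicate(replicate(fanout_placement, "addr", 1), "addr", 2)

# Case 2: the producer itself has two inputs that live only in cluster 0.
chain_deps = {"y": [], "z": [], "x": ["y", "z"], "u": ["x"]}
chain_placement = {"y": {0}, "z": {0}, "x": {0}, "u": {1}}
chain_replicated = replicate(chain_placement, "x", 1)

print(count_communications(fanout_deps, fanout_placement),
      count_communications(fanout_deps, fanout_replicated))
print(count_communications(chain_deps, chain_placement),
      count_communications(chain_deps, chain_replicated))
```

In the first case, replication removes both transfers at the price of two extra copies of `addr`, each of which occupies an issue slot in its cluster. In the second case, replicating `x` replaces one transfer with two, because the replica needs its own operands. This is one simple reason, in addition to the resource contention the abstract mentions, why the set of instructions to replicate has to be chosen selectively.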