An interleaved cache clustered VLIW processor

  • Authors:
  • Enric Gibert; Jesús Sánchez; Antonio González

  • Affiliations:
  • Universitat Politècnica de Catalunya, Barcelona - SPAIN; Universitat Politècnica de Catalunya, Barcelona - SPAIN; Universitat Politècnica de Catalunya, Barcelona - SPAIN

  • Venue:
  • ICS '02 Proceedings of the 16th international conference on Supercomputing
  • Year:
  • 2002

Abstract

Clustered microarchitectures are becoming a common organization due to their potential to reduce the penalties caused by wire delays and power consumption. Fully-distributed architectures are particularly effective to deal with these constraints, and besides they are very scalable. However, the distribution of the data cache memory poses a significant challenge and may be critical for performance. In this work, a distributed data cache VLIW architecture based on an interleaved cache organization along with cyclic scheduling techniques are proposed. Moreover, the use of Attraction Buffers for such an architecture is introduced. Attraction Buffers are a novel hardware mechanism to increase the percentage of local accesses. The idea is to allow the movement of some data towards the clusters that need it. Performance results for 9 Mediabench benchmarks show that our scheduling techniques are able to hide the increased memory latency when accessing data mapped in a remote cluster. In addition, the local hit ratio is increased by 15% and stall time is reduced by 30% when using the same scheduling techniques with an interleaved cache clustered processor with Attraction Buffers. Finally, the proposed architecture is compared with a state-of-the-art distributed architecture such as the multiVLIW. Results show that the performance of an interleaved cache clustered VLIW processor with Attraction Buffers is similar to that of the multiVLIW architecture, whereas the former has a lower hardware complexity.
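
The two mechanisms named in the abstract, an interleaved data cache and Attraction Buffers, can be illustrated with a toy model. The sketch below is not the authors' design: the cluster count, the word-sized interleaving granularity, the buffer capacity, the LRU replacement policy, and every name in it (`home_cluster`, `AttractionBuffer`, `local_access_ratio`) are assumptions made for illustration only. It ignores stores, coherence, and timing, and simply counts how many accesses a cluster can serve without going to a remote cache module.

```python
from collections import OrderedDict

N_CLUSTERS = 4        # assumed cluster count (not taken from the abstract)
INTERLEAVE_BYTES = 4  # assumed interleaving granularity: one 4-byte word


def home_cluster(addr: int) -> int:
    """Cluster whose cache module holds `addr` under word interleaving."""
    return (addr // INTERLEAVE_BYTES) % N_CLUSTERS


class AttractionBuffer:
    """Small per-cluster LRU buffer of remotely mapped words (hypothetical)."""

    def __init__(self, capacity: int = 8):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, addr: int) -> bool:
        """Return True and refresh recency if the word is buffered."""
        key = addr // INTERLEAVE_BYTES
        if key in self.entries:
            self.entries.move_to_end(key)
            return True
        return False

    def insert(self, addr: int) -> None:
        """Buffer a copy of the word, evicting the least recently used one."""
        key = addr // INTERLEAVE_BYTES
        self.entries[key] = True
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)


def local_access_ratio(accesses, use_buffers: bool) -> float:
    """Fraction of (cluster, addr) loads served inside the issuing cluster."""
    buffers = [AttractionBuffer() for _ in range(N_CLUSTERS)]
    local = 0
    for cluster, addr in accesses:
        if home_cluster(addr) == cluster:
            local += 1                      # word is mapped to this cluster
        elif use_buffers and buffers[cluster].lookup(addr):
            local += 1                      # remote word, but already attracted
        elif use_buffers:
            buffers[cluster].insert(addr)   # remote access; keep a local copy
    return local / len(accesses)


# Cluster 0 walks four consecutive words three times.
trace = [(0, addr) for addr in (0, 4, 8, 12)] * 3
print(local_access_ratio(trace, use_buffers=False))
print(local_access_ratio(trace, use_buffers=True))
```

In this toy trace only one word in four is mapped to the issuing cluster, so repeated passes benefit from keeping copies of the remote words nearby. The resulting ratios describe this model alone and are unrelated to the 15% and 30% figures reported in the abstract.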