Hardware spatial forwarding for widely shared data

  • Authors: Marius Pirvu; Laxmi Bhuyan

  • Affiliations: Department of Computer Science, Texas A&M University, College Station, TX (both authors)

  • Venue: Proceedings of the 14th International Conference on Supercomputing
  • Year: 2000

Abstract

Applications with widely shared data perform poorly on cc-NUMA multiprocessors because of the hot spots they create in the system. In this paper we address this problem by enhancing the memory controller with a forwarding mechanism capable of hiding the read latency of widely shared data, while potentially decreasing memory and network contention. Based on the influx of requests, the memory anticipates the next read references and forwards the data to the processors in advance. To identify the set of processors to which the data should be forwarded, we use a heuristic based on the spatial locality of memory blocks. To increase forwarding effectiveness and minimize the number of messages, we incorporate simple filters combined with a feedback mechanism. We also show that further improvements are possible using a combined software-prefetching/hardware-forwarding approach. Our experimental results, obtained with a detailed execution-driven simulator that models ILP processors, show significant improvements in execution time (up to 37%).
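
The abstract does not give the details of the heuristic, the filters, or the feedback mechanism, so the following is only a toy sketch of the general idea and not the authors' design. It assumes that the processors which read block b-1 are likely to read block b next, that forwarding is filtered by a minimum sharer count, and that feedback arrives as a notice when a forwarded copy is evicted unused. The class `ForwardingController`, its methods, and the parameters `min_sharers` and `min_accuracy` are all hypothetical names introduced for illustration.

```python
from collections import defaultdict


class ForwardingController:
    """Toy model of a memory controller that forwards widely shared blocks.

    Everything here is an illustrative assumption, not the paper's mechanism.
    """

    def __init__(self, min_sharers=2, min_accuracy=0.5):
        self.min_sharers = min_sharers    # filter: neighbour must be this widely shared
        self.min_accuracy = min_accuracy  # feedback: stop forwarding below this accuracy
        self.sharers = defaultdict(set)   # block number -> processors holding a copy
        self.sent = 0                     # forwarded copies sent so far
        self.wasted = 0                   # forwarded copies reported as never used

    def accuracy(self):
        """Fraction of forwarded copies not reported as wasted."""
        return 1.0 if self.sent == 0 else 1.0 - self.wasted / self.sent

    def on_read(self, proc, block):
        """Record a read request and return the processors to forward to."""
        self.sharers[block].add(proc)
        neighbours = self.sharers.get(block - 1, set())
        if len(neighbours) < self.min_sharers:
            return set()                  # neighbouring block is not widely shared
        if self.accuracy() < self.min_accuracy:
            return set()                  # feedback says forwarding is not paying off
        # Spatial heuristic: readers of block-1 that do not yet hold this block.
        targets = neighbours - self.sharers[block]
        self.sharers[block] |= targets
        self.sent += len(targets)
        return targets

    def on_unused_eviction(self, proc, block):
        """Feedback: a forwarded copy was evicted without being read."""
        self.sharers[block].discard(proc)
        self.wasted += 1


ctrl = ForwardingController()
for p in (0, 1, 2):
    ctrl.on_read(p, 10)                   # three processors read block 10
targets = ctrl.on_read(0, 11)             # processor 0 moves on to block 11
print(sorted(targets))                    # processors that receive block 11 early
```

In this sketch a single read of block 11 causes the block to be pushed to the other readers of block 10, which is how forwarding could replace several separate read requests to a hot block. The accuracy check here never re-enables forwarding once it trips; a realistic design would need some form of recovery.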