Prefetching and cache management using task lifetimes

  • Authors:
  • Vassilis Papaefstathiou; Manolis G.H. Katevenis; Dimitrios S. Nikolopoulos; Dionisios Pnevmatikatos

  • Affiliations:
  • FORTH-ICS, Heraklion, Crete, Greece; FORTH-ICS, Heraklion, Crete, Greece; Queen's University of Belfast, Belfast, United Kingdom; FORTH-ICS, Heraklion, Crete, Greece

  • Venue:
  • Proceedings of the 27th international ACM conference on International conference on supercomputing
  • Year:
  • 2013

Abstract

Task-based dataflow programming models and runtimes are emerging as promising candidates for programming multicore and manycore architectures. These programming models dynamically analyze task dependencies at runtime and schedule independent tasks concurrently onto the processing elements. In such models, cache locality, which is critical for performance, becomes more challenging in the presence of fine-grain tasks and in architectures with many simple cores. This paper presents a combined hardware-software approach to improve cache locality and offer better performance in terms of execution time and energy in the memory system. We propose the explicit bulk prefetcher (EBP) and epoch-based cache management (ECM) to help runtimes prefetch task data and guide the replacement decisions in caches. The runtime software can use this hardware support to expose its internal knowledge about the tasks to the architecture and achieve more efficient task-based execution. Our combined scheme outperforms HW-only prefetchers and state-of-the-art replacement policies, improves performance by an average of 17%, generates on average 26% fewer L2 misses, and consumes on average 28% less energy in the components of the memory system.