Off-loading application controlled data prefetching in numerical codes for multi-core processors

  • Authors:
  • J. Weidendorfer;C. Trinitis

  • Affiliations:
  • Institut für Informatik, Technische Universität München, D-85747 Garching bei München, Germany; Institut für Informatik, Technische Universität München, D-85747 Garching bei München, Germany

  • Venue:
  • International Journal of Computational Science and Engineering
  • Year:
  • 2008

Abstract

An important issue when designing numerical code in High Performance Computing is cache optimisation in order to exploit the performance potential of a given target architecture. This includes techniques to improve memory access locality as well as prefetching. Inherent algorithm constraints often limit the first approach, which typically uses a blocking technique. While there exist automatic prefetching mechanisms in hardware and/or compilers, they cannot complement blocking with additional prefetching. We provide an infrastructure for off-loading application-controlled prefetching on a chip multiprocessor, allowing further improvement of numerical code already optimised by standard cache optimisation. Clear benefits are shown for real workloads on existing hardware.
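
The general idea of combining blocking with off-loaded prefetching can be illustrated with a minimal sketch. It is not the authors' infrastructure. It assumes a GCC or Clang compiler (for `__builtin_prefetch`), POSIX threads, a 64-byte cache line, and that the operating system happens to schedule the helper thread on a core sharing a cache level with the compute thread (explicit pinning is omitted). The names `prefetch_ctx`, `prefetch_helper` and `blocked_sum_with_prefetch` are hypothetical. The compute thread walks an array block by block, while the helper thread issues prefetches for the block that follows the one currently being processed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define CACHE_LINE 64 /* assumed cache line size in bytes */

/* Hypothetical shared state between compute thread and helper thread. */
typedef struct {
    const double *data;
    size_t n;              /* number of elements */
    size_t block;          /* elements per block */
    atomic_size_t current; /* block the compute thread is working on */
    atomic_int done;       /* set once the compute thread has finished */
} prefetch_ctx;

/* Helper thread: stays one block ahead of the compute thread and
 * issues prefetch hints for that block. It busy-waits in between,
 * which is acceptable only for a sketch. */
static void *prefetch_helper(void *arg)
{
    prefetch_ctx *ctx = arg;
    size_t nblocks = (ctx->n + ctx->block - 1) / ctx->block;
    size_t next = 1; /* block 0 is loaded by the compute thread itself */

    while (!atomic_load(&ctx->done) && next < nblocks) {
        size_t cur = atomic_load(&ctx->current);
        if (next <= cur)
            next = cur + 1; /* fell behind: skip blocks already in use */
        if (next >= nblocks)
            break;
        if (next > cur + 1)
            continue;       /* already one block ahead: wait */

        size_t begin = next * ctx->block;
        size_t end = begin + ctx->block;
        if (end > ctx->n)
            end = ctx->n;
        /* One prefetch hint per assumed cache line (read, keep in cache). */
        for (size_t i = begin; i < end; i += CACHE_LINE / sizeof(double))
            __builtin_prefetch(&ctx->data[i], 0, 3);
        next++;
    }
    return NULL;
}

/* Compute thread: blocked traversal that publishes its progress. */
double blocked_sum_with_prefetch(const double *data, size_t n, size_t block)
{
    assert(block > 0);

    prefetch_ctx ctx = { .data = data, .n = n, .block = block };
    atomic_init(&ctx.current, 0);
    atomic_init(&ctx.done, 0);

    pthread_t helper;
    /* If the helper cannot be created, fall back to plain blocking. */
    int have_helper =
        pthread_create(&helper, NULL, prefetch_helper, &ctx) == 0;

    double sum = 0.0;
    size_t nblocks = (n + block - 1) / block;
    for (size_t b = 0; b < nblocks; b++) {
        atomic_store(&ctx.current, b); /* tell the helper where we are */
        size_t begin = b * block;
        size_t end = begin + block > n ? n : begin + block;
        for (size_t i = begin; i < end; i++)
            sum += data[i];
    }

    atomic_store(&ctx.done, 1);
    if (have_helper)
        pthread_join(helper, NULL);
    return sum;
}
```

A caller would invoke, for example, `blocked_sum_with_prefetch(a, n, 4096)` and compile with `-pthread`. Whether such a helper thread pays off depends on the block size, on the shared cache level, and on how far ahead the helper runs. A simple streaming sum like this one is also likely to be covered by hardware prefetchers already, so the sketch shows the mechanism rather than a case where a benefit should be expected.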