Efficient emulation of hardware prefetchers via event-driven helper threading

  • Authors:
  • Ilya Ganusov;Martin Burtscher

  • Affiliations:
  • Cornell University, Ithaca, New York;Cornell University, Ithaca, New York

  • Venue:
  • Proceedings of the 15th international conference on Parallel architectures and compilation techniques
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

The advance of multi-core architectures provides significant benefits for parallel and throughput-oriented computing, but the performance of individual computation threads does not improve and may even suffer a penalty because of the increased contention for shared resources. This paper explores the idea of using available general-purpose cores in a CMP as helper engines for individual threads running on the active cores. We propose a lightweight architectural framework for efficient event-driven software emulation of complex hardware accelerators and describe how this framework can be applied to implement a variety of prefetching techniques. We demonstrate the viability and effectiveness of our framework on a wide range of applications from the SPEC CPU2000 and Olden benchmark suites. On average, our mechanism provides performance benefits within 5% of pure hardware implementations. Furthermore, we demonstrate that running event-driven prefetching threads on top of a baseline with a hardware stride prefetcher yields significant speedups for many programs. Finally, we show that our approach provides competitive performance improvements over other hardware approaches for multi-core execution while executing fewer instructions and requiring considerably less hardware support.