L1-bandwidth aware thread allocation in multicore SMT processors

  • Authors:
  • Josué Feliu;Julio Sahuquillo;Salvador Petit;José Duato

  • Affiliations:
  • Universitat Politècnica de València, València, Spain;Universitat Politècnica de València, València, Spain;Universitat Politècnica de València, València, Spain;Universitat Politècnica de València, València, Spain

  • Venue:
  • PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

Improving the utilization of shared resources is a key issue to increase performance in SMT processors. Recent work has focused on resource sharing policies to enhance the processor performance, but their proposals mainly concentrate on novel hardware mechanisms that adapt to the dynamic resource requirements of the running threads. This work addresses the L1 cache bandwidth problem in SMT processors experimentally on real hardware. Unlike previous work, this paper concentrates on thread allocation, by selecting the proper pair of co-runners to be launched to the same core. The relation between L1 bandwidth requirements of each benchmark and its performance (IPC) is analyzed. We found that for individual benchmarks, performance is strongly connected to L1 bandwidth consumption, and this observation remains valid when several co-runners are launched to the same SMT core. Based on these findings we propose two L1 bandwidth aware thread to core (t2c) allocation policies, namely Static and Dynamic t2c allocation, respectively. The aim of these policies is to properly balance L1 bandwidth requirements of the running threads among the processor cores. Experiments on a Xeon E5645 processor show that the proposed policies significantly improve the performance of the Linux OS kernel regardless the number of cores considered.