Fair CPU time accounting in CMP+SMT processors

  • Authors:
  • Carlos Luque;Miquel Moreto;Francisco J. Cazorla;Mateo Valero

  • Affiliations:
  • Universitat Politècnica de Catalunya, and Barcelona Supercomputing Center, Barcelona, Spain;International Computer Science Institute, Universitat Politècnica de Catalunya, and Barcelona Supercomputing Center, Berkeley, CA;Barcelona Supercomputing Center, and Spanish National Research Council (IIIA-CSIC), Barcelona, Spain;Universitat Politècnica de Catalunya, and Barcelona Supercomputing Center, Barcelona, Spain

  • Venue:
  • ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
  • Year:
  • 2013

Abstract

Processor architectures combining several paradigms of Thread-Level Parallelism (TLP), such as CMP processors in which each core is SMT, are becoming increasingly popular as a way to improve performance at moderate cost. However, in multi-TLP architectures, the complex interaction between running tasks on shared hardware resources complicates accounting CPU time (or CPU utilization) to tasks. The CPU utilization accounted to a task depends on both the time it runs on the processor and the amount of processor hardware resources it receives. Deploying systems with accurate CPU accounting mechanisms is necessary to increase fairness. Moreover, it will allow users to be fairly charged in a shared data center, facilitating server consolidation in future systems. In this article, we analyze the accuracy and hardware cost of previous CPU accounting mechanisms for pure-CMP and pure-SMT processors and show that they are not adequate for CMP+SMT processors. Consequently, we propose a new accounting mechanism for CMP+SMT processors that: (1) increases the accuracy of accounted CPU utilization; (2) provides much more stable results over a wide range of processor setups; and (3) does not require tracking all shared hardware resources, significantly reducing its implementation cost. In particular, previous proposals lead to inaccuracies between 21% and 79% when measuring CPU utilization in an 8-core 2-way SMT processor, while our proposal reduces this inaccuracy to less than 5.0%.