Best of Both Latency and Throughput

  • Authors:
  • Ed Grochowski; Ronny Ronen; John Shen; Hong Wang

  • Affiliations:
  • Intel Labs, Santa Clara, CA; Intel Israel; Intel Labs, Santa Clara, CA; Intel Labs, Santa Clara, CA

  • Venue:
  • ICCD '04 Proceedings of the IEEE International Conference on Computer Design
  • Year:
  • 2004

Abstract

This paper describes the tradeoff between latency performance and throughput performance in a power-constrained environment. We show that the key to achieving both excellent latency performance and excellent throughput performance is to dynamically vary the amount of energy expended to process instructions according to the amount of parallelism available in the software. We survey four techniques for achieving variable energy per instruction: voltage/frequency scaling, asymmetric cores, variable-size cores, and speculation control. We estimate the potential range of energies obtainable by each technique and conclude that a combination of asymmetric cores and voltage/frequency scaling offers the most promising approach to designing a chip-level multiprocessor that can achieve both excellent latency performance and excellent throughput performance.
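
The tradeoff described in the abstract can be illustrated with a small sketch. It relies only on the dimensional identity that power equals energy per instruction times instructions per second, so under a fixed power budget the sustainable instruction rate is the budget divided by the energy per instruction. The function names, the 100 W budget, the nanojoule figures, and the first-order assumption that dynamic energy scales with the square of supply voltage are illustrative assumptions, not values or models taken from the paper.

```python
def sustainable_ips(power_budget_watts: float, energy_per_instruction_joules: float) -> float:
    """Instruction rate that fits a power budget, from power = EPI * IPS.

    Hypothetical helper for illustration; not from the paper.
    """
    if power_budget_watts < 0:
        raise ValueError("power budget must be non-negative")
    if energy_per_instruction_joules <= 0:
        raise ValueError("energy per instruction must be positive")
    return power_budget_watts / energy_per_instruction_joules


def scaled_epi(base_epi_joules: float, voltage_ratio: float) -> float:
    """Energy per instruction after scaling supply voltage by voltage_ratio.

    Assumption: dynamic switching energy is proportional to V^2 and
    leakage is ignored. This is a first-order approximation only.
    """
    if voltage_ratio <= 0:
        raise ValueError("voltage ratio must be positive")
    return base_epi_joules * voltage_ratio ** 2


if __name__ == "__main__":
    budget = 100.0       # watts, hypothetical chip power budget
    high_epi = 10e-9     # joules, hypothetical latency-oriented operating point
    low_epi = scaled_epi(high_epi, 0.5)  # same design at half the voltage

    # Little parallelism: spend more energy on each instruction.
    print("high-EPI rate:", sustainable_ips(budget, high_epi))
    # Ample parallelism: spend less per instruction, so more fit in the budget.
    print("low-EPI rate: ", sustainable_ips(budget, low_epi))
```

In this sketch, lowering the energy per instruction raises the aggregate instruction rate that fits within the same budget, which is the throughput side of the tradeoff. Spending more energy per instruction buys single-thread speed at the cost of that aggregate rate.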