Over-provisioned multicore systems

  • Authors:
  • Gurindar S. Sohi; Koushik Chakraborty

  • Affiliations:
  • The University of Wisconsin - Madison; The University of Wisconsin - Madison

  • Venue:
  • Over-provisioned multicore systems
  • Year:
  • 2008

Abstract

Technology scaling has given system designers an exploding transistor budget, far larger than what was available when the core principles behind many existing commodity microprocessors were conceived. This growth also brings a new set of engineering challenges involving power density, thermal efficiency, and related concerns. In particular, the power constraint is becoming a first-order consideration in microprocessor design. For general-purpose processors, power-limited designs mark a significant paradigm shift from the area-limited designs of the past. This dissertation proposes a model to capture the first-order impact of the power constraint. Denoted the Simultaneously Active Fraction (SAF), this metric represents the fraction of the entire chip's resources that can be active simultaneously under a target power envelope. Because improvements in the energy efficiency of individual transistor devices lag behind the growth in their integration capacity, the dissertation finds that the SAF decreases monotonically with each successive technology generation. In the context of a rapidly shrinking SAF, this dissertation investigates a novel multicore design paradigm: the Over-provisioned Multicore System (OPMS). OPMS is a class of multicores that, by design, provision more processing core resources than can be kept active within their target Thermal Design Power (TDP). Since only a subset of the on-chip cores is active at any given time, this design paradigm affords considerable flexibility in assigning computation to processing cores, enabling many novel techniques within this broad framework. To demonstrate a concrete application of this framework, the dissertation proposes Computation Spreading (CSP): a new model for distributing the collective work of multithreaded applications.
CSP aims to collocate similar computation fragments from different threads on the same core, while distributing dissimilar computation fragments from the same thread across multiple cores. Under CSP, the on-chip cores of an OPMS are dynamically specialized by retaining mutually exclusive predictive state. The dissertation demonstrates the effectiveness of CSP in an OPMS through a rigorous evaluation of performance, energy efficiency, and several design trade-offs.
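
The two ideas in the abstract, a power-limited active fraction and the grouping of similar fragments on one core, can be illustrated with a toy model. The sketch below is not the dissertation's formulation. It assumes that cores are the only power consumers and that each draws the same power, so the SAF reduces to active cores divided by provisioned cores. The names `Fragment`, `active_core_budget`, `simultaneously_active_fraction`, and `spread`, the similarity labels such as "user" and "syscall", and the round-robin mapping of labels to cores are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A piece of work from one thread (hypothetical representation)."""
    thread_id: int
    kind: str  # assumed similarity label, e.g. "user" or "syscall"


def active_core_budget(total_cores: int, tdp_watts: float, watts_per_core: float) -> int:
    """Number of cores that fit in the power envelope, assuming uniform per-core power."""
    if total_cores < 1 or tdp_watts <= 0 or watts_per_core <= 0:
        raise ValueError("core count and power values must be positive")
    return min(total_cores, int(tdp_watts // watts_per_core))


def simultaneously_active_fraction(total_cores: int, tdp_watts: float, watts_per_core: float) -> float:
    """Simplified SAF: active cores over provisioned cores."""
    return active_core_budget(total_cores, tdp_watts, watts_per_core) / total_cores


def spread(fragments, active_cores: int) -> dict:
    """Place fragments of the same kind on the same core, regardless of thread.

    Kinds are mapped to cores in order of first appearance, wrapping around
    when there are more kinds than active cores.
    """
    if active_cores < 1:
        raise ValueError("at least one active core is required")
    core_of_kind = {}
    schedule = defaultdict(list)
    for frag in fragments:
        core = core_of_kind.setdefault(frag.kind, len(core_of_kind) % active_cores)
        schedule[core].append(frag)
    return dict(schedule)


# Example: 16 provisioned cores at 10 W each under an 80 W envelope.
active = active_core_budget(16, 80.0, 10.0)
saf = simultaneously_active_fraction(16, 80.0, 10.0)
work = [
    Fragment(0, "user"), Fragment(0, "syscall"),
    Fragment(1, "user"), Fragment(1, "syscall"),
]
schedule = spread(work, active)
```

In this example the fragments of each thread end up on two different cores, while each core sees only one kind of fragment. That is the property the abstract associates with specialized predictive state, although the model does not simulate caches or predictors.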