Characterizing the resource-sharing levels in the UltraSPARC T2 processor

  • Authors:
  • Vladimir Čakarević;Petar Radojković;Javier Verdú;Alex Pajuelo;Francisco J. Cazorla;Mario Nemirovsky;Mateo Valero

  • Affiliations:
  • Barcelona Supercomputing Center (BSC);Barcelona Supercomputing Center (BSC);Universitat Politècnica de Catalunya (UPC);Universitat Politècnica de Catalunya (UPC);Barcelona Supercomputing Center (BSC);Barcelona Supercomputing Center (BSC) and ICREA Research Professor;Barcelona Supercomputing Center (BSC) and Universitat Politècnica de Catalunya (UPC)

  • Venue:
  • Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
  • Year:
  • 2009

Abstract

Thread-level parallelism (TLP) has become a popular way to improve processor performance, overcoming the limitations of extracting instruction-level parallelism. Each TLP paradigm, such as Simultaneous Multithreading or Chip-Multiprocessors, provides different benefits, which has motivated processor vendors to combine several TLP paradigms in each chip design. Even though most of these combined-TLP designs are homogeneous, they present different levels of hardware resource sharing, which complicates operating system scheduling and load balancing. Commonly, processor designs provide two levels of resource sharing: inter-core, in which only the highest levels of the cache hierarchy are shared, and intra-core, in which most of the hardware resources of the core are shared. Recently, Sun Microsystems released the UltraSPARC T2, a processor with three levels of hardware resource sharing: InterCore, IntraCore, and IntraPipe. In this work, we provide the first characterization of a three-level resource-sharing processor, the UltraSPARC T2, and we show how multi-level resource sharing affects operating system design. We further identify the most critical hardware resources in the T2 and the characteristics of applications that are not sensitive to resource sharing. Finally, we present a case study in which we run a real multithreaded network application, showing that a resource-sharing-aware scheduler can improve system throughput by up to 55%.
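
To make the three sharing levels named in the abstract concrete, the following is a minimal sketch that classifies which level two co-scheduled tasks would contend at. It is not taken from the paper. The topology constants (8 cores, 2 pipelines per core, 4 hardware strands per pipeline) and the flat, linear strand numbering are assumptions made for illustration, and the helper names `locate` and `sharing_level` are hypothetical.

```python
# Assumed topology, not stated in the abstract:
# 8 cores, 2 pipelines per core, 4 hardware strands per pipeline.
CORES = 8
PIPES_PER_CORE = 2
STRANDS_PER_PIPE = 4
STRANDS_PER_CORE = PIPES_PER_CORE * STRANDS_PER_PIPE
TOTAL_STRANDS = CORES * STRANDS_PER_CORE


def locate(strand_id):
    """Map a flat strand id to (core, pipe), assuming linear numbering."""
    if not 0 <= strand_id < TOTAL_STRANDS:
        raise ValueError(f"strand id out of range: {strand_id}")
    core, offset = divmod(strand_id, STRANDS_PER_CORE)
    pipe = offset // STRANDS_PER_PIPE
    return core, pipe


def sharing_level(strand_a, strand_b):
    """Return the sharing level at which two distinct strands interact."""
    if strand_a == strand_b:
        raise ValueError("two tasks need two distinct strands")
    core_a, pipe_a = locate(strand_a)
    core_b, pipe_b = locate(strand_b)
    if core_a != core_b:
        return "InterCore"   # different cores: only chip-level resources shared
    if pipe_a != pipe_b:
        return "IntraCore"   # same core, different pipelines
    return "IntraPipe"       # same pipeline: the tightest level of sharing


if __name__ == "__main__":
    for a, b in [(0, 1), (0, 4), (0, 8)]:
        print(a, b, sharing_level(a, b))
```

A scheduler that is aware of resource sharing could use a classification like this to decide whether two tasks that interfere heavily should be placed on different pipelines or different cores.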