Understanding PARSEC performance on contemporary CMPs

  • Authors: Major Bhadauria; Vincent M. Weaver; Sally A. McKee
  • Affiliations: Cornell University, USA; Cornell University, USA; Chalmers University of Technology, Sweden
  • Venue: IISWC '09: Proceedings of the 2009 IEEE International Symposium on Workload Characterization (IISWC)
  • Year: 2009

Abstract

PARSEC is a reference application suite used in industry and academia to assess new Chip Multiprocessor (CMP) designs. No investigation to date has profiled PARSEC on real hardware to better understand its scaling properties and bottlenecks. This understanding is crucial for guiding future CMP designs for these kinds of emerging workloads. We use hardware performance counters, taking a systems-level approach and varying common architectural parameters: number of out-of-order cores, memory hierarchy configurations, number of simultaneous threads, number of memory channels, and processor frequencies. We find these programs to be largely compute-bound, and thus limited by the number of cores, micro-architectural resources, and cache-to-cache transfers, rather than by off-chip memory or system bus bandwidth. Half the suite fails to scale linearly with an increasing number of threads, and some applications saturate performance at only a few threads on all platforms tested. Exploiting thread-level parallelism (TLP) delivers greater payoffs than exploiting instruction-level parallelism (ILP). To reduce power and improve performance, we recommend increasing the number of arithmetic units per core, increasing support for TLP, and reducing support for ILP.
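The scaling claims above (linear versus sub-linear scaling, and saturation at a few threads) can be made concrete with a small calculation. The following is a minimal sketch, not the authors' methodology: the function names, the 0.9 efficiency threshold, the 1.1 minimum-gain threshold, and the timing values are all hypothetical and chosen only for illustration. It computes speedup and parallel efficiency from wall-clock times measured at several thread counts.

```python
# Illustrative only: names, thresholds, and timings are assumptions,
# not taken from the paper.

def scaling_table(timings):
    """Return (threads, speedup, efficiency) rows.

    timings maps thread count -> wall-clock seconds and must include 1,
    which serves as the single-thread baseline.
    """
    base = timings[1]
    rows = []
    for n in sorted(timings):
        speedup = base / timings[n]
        rows.append((n, speedup, speedup / n))
    return rows


def scales_linearly(timings, min_efficiency=0.9):
    """True if parallel efficiency stays at or above min_efficiency everywhere."""
    return all(eff >= min_efficiency for _, _, eff in scaling_table(timings))


def saturation_threads(timings, min_gain=1.1):
    """Return the first thread count whose next step up improves runtime
    by less than a factor of min_gain, or None if no step falls short."""
    counts = sorted(timings)
    for lo, hi in zip(counts, counts[1:]):
        if timings[lo] / timings[hi] < min_gain:
            return lo
    return None


# Hypothetical measurements (seconds) for two made-up workloads.
near_linear = {1: 100.0, 2: 51.0, 4: 26.0, 8: 13.5}
saturating = {1: 100.0, 2: 60.0, 4: 55.0, 8: 54.0}

for name, timings in (("near_linear", near_linear), ("saturating", saturating)):
    print(name, scales_linearly(timings), saturation_threads(timings))
```

Under these assumed thresholds, the first workload counts as scaling linearly and never saturates, while the second is flagged as saturating at 2 threads. Real measurements would come from timing each benchmark at each thread count on the platform of interest.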