Modeling Cache Contention and Throughput of Multiprogrammed Manycore Processors

  • Authors:
  • Xi E. Chen;Tor Aamodt

  • Affiliations:
  • NVIDIA, Portland;University of British Columbia, Vancouver

  • Venue:
  • IEEE Transactions on Computers
  • Year:
  • 2012

Quantified Score

Hi-index 14.98

Visualization

Abstract

This paper proposes an analytical model for accurately predicting the impact of contention on cache miss rates. The focus is multiprogrammed workloads running on multithreaded manycore architectures. This work addresses a key challenge facing earlier cache contention models as the number of concurrent threads exceeds the associativity of shared caches. The memory access characteristics of individual applications are obtained in isolation by profiling their circular sequences and two new measures of access locality are proposed. An evaluation of this model in the context of a Niagara processor shows that it achieves an average 8.7 percent error in miss rate predictions which improves upon the best prior model by 48.1x. This paper also presents a novel Markov chain throughput model. When combining the contention model with the Markov chain model, throughput is estimated with an average error of 8.3 percent compared to detailed simulation. Moreover, the combined model tracks throughput sufficiently well to find the same optimized design point for application-specific workloads 65 times faster than detailed simulation. This paper also shows that the models accurately predict cache contention and throughput trends across various workloads on real hardware.