Organizing the Last Line of Defense before Hitting the Memory Wall for CMPs

  • Authors: Chun Liu; Anand Sivasubramaniam; Mahmut Kandemir

  • Affiliations: Pennsylvania State University (all authors)

  • Venue: HPCA '04: Proceedings of the 10th International Symposium on High Performance Computer Architecture
  • Year: 2004

Abstract

The last line of defense in the cache hierarchy before going to off-chip memory is critical in chip multiprocessors (CMPs) from both the performance and power perspectives. This paper investigates different organizations for this last line of defense (assumed to be the L2 cache in this paper) with the goal of reducing off-chip memory accesses. We evaluate the trade-offs between private L2 and address-interleaved shared L2 designs, noting their individual benefits and drawbacks. A possible imbalance between the L2 demands of the CPUs favors a shared L2 organization, while interference between these demands can favor a private L2 organization. We propose a new architecture, called Shared Processor-Based Split L2, that captures the benefits of these two organizations while avoiding many of their drawbacks. Using several applications from the SPEC OMP suite and a commercial benchmark, SPECjbb, on a complete system simulator, we demonstrate the benefits of this shared processor-based L2 organization. Our results show as much as a 42.50% improvement in IPC over the private organization (11.52% on average) and as much as a 42.22% improvement over the shared interleaved organization (9.76% on average).
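
To make the two baseline organizations in the abstract concrete, the following is a minimal sketch of how a request might be steered to an L2 slice under each of them. It is not taken from the paper: the line size, the CPU count, and the function names `private_l2_slice` and `interleaved_l2_slice` are illustrative assumptions, and the proposed Shared Processor-Based Split L2 is deliberately not modeled because the abstract does not describe its mechanism.

```python
# Illustrative assumptions, not values from the paper.
LINE_BYTES = 64   # assumed cache-line size in bytes
NUM_CPUS = 4      # assumed core count, with one L2 slice per core


def private_l2_slice(cpu_id: int, addr: int) -> int:
    """Private L2: a CPU only uses its own slice, whatever the address."""
    return cpu_id


def interleaved_l2_slice(cpu_id: int, addr: int) -> int:
    """Address-interleaved shared L2: the line address selects the slice,
    so consecutive lines are spread across all slices for every CPU."""
    return (addr // LINE_BYTES) % NUM_CPUS


if __name__ == "__main__":
    # CPU 2 touches four consecutive cache lines.
    for addr in range(0x1000, 0x1000 + 4 * LINE_BYTES, LINE_BYTES):
        print(hex(addr),
              "private ->", private_l2_slice(2, addr),
              "interleaved ->", interleaved_l2_slice(2, addr))
```

Under these assumptions, a CPU with a large working set is confined to one slice in the private case (the imbalance problem), whereas in the interleaved case every CPU's lines land in every slice (the interference problem).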