TurboTag: lookup filtering to reduce coherence directory power

  • Authors:
  • Pejman Lotfi-Kamran;Michael Ferdman;Daniel Crisan;Babak Falsafi

  • Affiliations:
  • École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland;Carnegie Mellon University/École Polytechnique Fédérale de Lausanne, Pittsburgh/Lausanne, USA;École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland;École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland

  • Venue:
  • Proceedings of the 16th ACM/IEEE international symposium on Low power electronics and design
  • Year:
  • 2010

Abstract

On-chip coherence directories of today's multi-core systems are not energy efficient. Coherence directories dissipate a significant fraction of their power on unnecessary lookups when running commercial server and scientific workloads. These workloads have large working sets that are beyond the reach of on-chip caches of modern processors. Limited to capturing a small part of the working set, private caches retain cache blocks only for a short period of time before replacing them with new blocks. Moreover, coherence enforcement is a known performance bottleneck of multi-threaded software, hence data sharing in optimized high-performance software is minimal. Consequently, the majority of the accesses to the coherence directory find no sharers in the directory because the data are not available in the on-chip private caches, effectively wasting power on the coherence checks. To improve energy efficiency for future many-core systems, we propose TurboTag, a filtering mechanism to eliminate needless directory lookups. We analyze full-system traces of server and scientific workloads and find that over 69% of accesses to the directory find no sharers and can be entirely avoided. Taking advantage of this behavior, TurboTag achieves a 58% reduction in the directory's dynamic power consumption.
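
The abstract does not describe how the filter is built, so the following is only a minimal sketch of the general idea of lookup filtering, not the authors' design. It assumes a small counting Bloom filter placed in front of the directory: the filter can report "definitely no sharers" without touching the directory, and it never produces a false negative. The names `CountingFilter` and `FilteredDirectory`, the counter and hash counts, and the hash function are all hypothetical choices made for illustration.

```python
class CountingFilter:
    """Hypothetical counting Bloom filter over block addresses (an assumption)."""

    def __init__(self, num_counters=1024, num_hashes=2):
        self.counters = [0] * num_counters
        self.num_hashes = num_hashes

    def _slots(self, block_addr):
        # Illustrative hashing only; hardware would use cheap bit-select or XOR hashes.
        n = len(self.counters)
        return [hash((block_addr, i)) % n for i in range(self.num_hashes)]

    def insert(self, block_addr):
        for slot in self._slots(block_addr):
            self.counters[slot] += 1

    def remove(self, block_addr):
        for slot in self._slots(block_addr):
            self.counters[slot] -= 1

    def may_contain(self, block_addr):
        # False means the block is certainly untracked; True may be a false positive.
        return all(self.counters[slot] > 0 for slot in self._slots(block_addr))


class FilteredDirectory:
    """Toy directory that consults the filter before every lookup."""

    def __init__(self):
        self.sharers = {}  # block address -> set of core ids holding the block
        self.filter = CountingFilter()
        self.lookups_performed = 0
        self.lookups_skipped = 0

    def add_sharer(self, block_addr, core):
        if block_addr not in self.sharers:
            # First sharer: the block becomes tracked, so record it in the filter.
            self.sharers[block_addr] = set()
            self.filter.insert(block_addr)
        self.sharers[block_addr].add(core)

    def remove_sharer(self, block_addr, core):
        cores = self.sharers.get(block_addr)
        if cores is None:
            return
        cores.discard(core)
        if not cores:
            # Last sharer gone: drop the entry and undo the filter insert.
            del self.sharers[block_addr]
            self.filter.remove(block_addr)

    def lookup(self, block_addr):
        if not self.filter.may_contain(block_addr):
            # The filter proves there are no sharers, so the directory is not accessed.
            self.lookups_skipped += 1
            return frozenset()
        self.lookups_performed += 1
        return frozenset(self.sharers.get(block_addr, ()))
```

In this sketch, a request for a block that no private cache holds is answered by the filter alone and increments `lookups_skipped`, which stands in for the directory accesses that the abstract reports as avoidable. A false positive costs only one ordinary directory lookup, so correctness does not depend on the filter's accuracy.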