L1 Collective Cache: Managing Shared Data for Chip Multiprocessors

  • Authors:
  • Guanjun Jiang, Degui Fen, Liangliang Tong, Lingxiang Xiang, Chao Wang, Tianzhou Chen

  • Affiliations:
  • College of Computer Science, Zhejiang University, China and Department of Computer Science, Hong Kong University, China

  • Venue:
  • APPT '09 Proceedings of the 8th International Symposium on Advanced Parallel Processing Technologies
  • Year:
  • 2009


Abstract

In recent years, with improvements in single-processor performance approaching their limit, more and more researchers have shifted to the idea of Chip Multiprocessors (CMPs). The burgeoning of multi-threaded programs brings dramatically increased inter-core communication. Unfortunately, traditional architectures fail to meet this challenge, as they conduct such communication through the last level of on-chip cache, or even through memory. This paper proposes a novel approach, called Collective Cache, that differentiates accesses to shared and private data and handles data communication at the first-level cache. In the proposed cache architecture, shared data found in the last-level cache are moved into the Collective Cache, an L1 cache structure shared by all cores. We show that the proposed mechanism can greatly enhance inter-processor communication, increase the usage efficiency of the L1 cache, and simplify the data-consistency protocol. Extensive analysis of this approach with Simics shows that it can reduce the L1 cache miss rate by 3.36%.
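The routing idea sketched in the abstract can be illustrated with a toy model. This is not the authors' implementation; it is a minimal Python sketch, under the assumption that a line touched by more than one core is classified as shared and thereafter held only in a single collective L1 shared by all cores, while private lines stay in per-core L1s. All class and variable names here are hypothetical.

```python
class ToyCMP:
    """Toy model: per-core private L1s plus one collective L1 for shared data."""

    def __init__(self, num_cores, private_lines, collective_lines):
        self.private_l1 = [set() for _ in range(num_cores)]  # one private L1 per core
        self.collective = set()                              # L1 shared by all cores
        self.private_lines = private_lines
        self.collective_lines = collective_lines
        self.sharers = {}    # line address -> set of cores that have touched it
        self.misses = 0
        self.accesses = 0

    def access(self, core, addr):
        self.accesses += 1
        self.sharers.setdefault(addr, set()).add(core)
        shared = len(self.sharers[addr]) > 1     # touched by >1 core => shared
        cache = self.collective if shared else self.private_l1[core]
        limit = self.collective_lines if shared else self.private_lines
        if addr not in cache:
            self.misses += 1
            if len(cache) >= limit:              # crude eviction when full
                cache.pop()
            cache.add(addr)
            if shared:
                # Keep only the single copy in the collective cache; dropping
                # private copies is what simplifies the consistency protocol.
                for l1 in self.private_l1:
                    l1.discard(addr)

cmp_ = ToyCMP(num_cores=2, private_lines=4, collective_lines=4)
cmp_.access(0, 0x100)   # private miss on core 0
cmp_.access(1, 0x100)   # second core touches the line: moved to the collective L1
cmp_.access(0, 0x100)   # core 0 now hits in the collective L1, at L1 latency
print(cmp_.misses, cmp_.accesses)   # -> 2 3
```

In this sketch the third access hits at the first level rather than falling through to the last-level cache, which is the communication path the abstract says traditional architectures would take.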