L1 Collective Cache: Managing Shared Data for Chip Multiprocessors

Authors:
Guanjun Jiang;Degui Fen;Liangliang Tong;Lingxiang Xiang;Chao Wang;Tianzhou Chen
Affiliations:
College of Computer Science, Zhejiang University, China and Department of Computer Science, Hongkong University, China (all authors)
Venue:
APPT '09 Proceedings of the 8th International Symposium on Advanced Parallel Processing Technologies
Year:
2009

Abstract

In recent years, as further improvements in single-processor performance appear to be reaching their limits, more and more researchers have turned to Chip Multiprocessors (CMPs). The growth of multi-threaded programs brings dramatically increased inter-core communication. Unfortunately, traditional architectures fail to meet this challenge, because they carry out such communication in the last-level on-chip cache or even in memory. This paper proposes a novel approach, called Collective Cache, that differentiates accesses to shared and private data and handles data communication in the first-level cache. In the proposed cache architecture, shared data found in the last-level cache are moved into the Collective Cache, an L1 cache structure shared by all cores. We show that the proposed mechanism can substantially enhance inter-processor communication, increase the utilization of the L1 cache, and simplify the data consistency protocol. Extensive analysis of this approach with Simics shows that it can reduce the L1 cache miss rate by 3.36%.
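The data movement described in the abstract can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the paper's design: the abstract does not say how shared data are detected, so the model assumes a hypothetical last-level directory that marks a block as shared once a second core touches it. The class names, capacities, and LRU replacement are likewise illustrative choices.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny fully associative LRU cache that tracks block addresses only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def lookup(self, addr):
        if addr in self.blocks:
            self.blocks.move_to_end(addr)  # refresh LRU position
            return True
        return False

    def insert(self, addr):
        self.blocks[addr] = True
        self.blocks.move_to_end(addr)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used

    def remove(self, addr):
        self.blocks.pop(addr, None)


class ToyCollectiveCacheSystem:
    """Per-core private L1 caches plus one L1-level cache shared by all cores."""

    def __init__(self, num_cores, private_capacity=4, collective_capacity=4):
        self.private_l1 = [LRUCache(private_capacity) for _ in range(num_cores)]
        self.collective = LRUCache(collective_capacity)
        # ASSUMPTION: a last-level directory recording which cores touched a block.
        self.sharers = {}
        self.l1_hits = 0
        self.l1_misses = 0

    def access(self, core, addr):
        # Probe the shared L1 structure and the requesting core's private L1.
        if self.collective.lookup(addr) or self.private_l1[core].lookup(addr):
            self.l1_hits += 1
            return "hit"
        self.l1_misses += 1
        seen = self.sharers.setdefault(addr, set())
        seen.add(core)
        if len(seen) > 1:
            # Block is now classified as shared: keep a single copy in the
            # collective cache and drop any private copies.
            for l1 in self.private_l1:
                l1.remove(addr)
            self.collective.insert(addr)
            return "miss->collective"
        self.private_l1[core].insert(addr)
        return "miss->private"


if __name__ == "__main__":
    system = ToyCollectiveCacheSystem(num_cores=2)
    for core, addr in [(0, 0x40), (0, 0x40), (1, 0x40), (0, 0x40), (1, 0x40)]:
        print(core, hex(addr), system.access(core, addr))
```

In this model a shared block lives in exactly one L1-level location, so the private L1 caches need no invalidation traffic for it after the move. That is one plausible reading of how such a design could simplify the consistency protocol, not a claim about the paper's actual protocol.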