Regional cache organization for NoC based many-core processors

Authors:
John M. Ye; Man Cao; Zening Qu; Tianzhou Chen
Affiliations:
College of Computer Science, Zhejiang University, Hangzhou, 310027, PR China (all four authors)
Venue:
Journal of Computer and System Sciences
Year:
2013

Abstract

As the number of Processing Elements (PEs) on a single chip keeps growing, memory references are becoming slower because of longer wire delays, more intense contention for on-chip resources, and heavier network traffic congestion. Network on Chip (NoC) is now considered a promising paradigm for inter-core connection in future many-core processors. In this paper, we examine how regional cache organizations drastically reduce the average network latency, and we propose a regional cache architecture with Delegate Memory Management Units (D-MMUs) for NoC-based processors. Experiments show that L2 cache access latency is largely determined by the cache's organization and by how it is interconnected with the PEs in the NoC, and that a regional organization is essential for better NoC cache performance.
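
To give a rough feel for why a regional organization might shorten network paths, the sketch below compares average hop counts in a toy model. It is not the paper's simulator or method. The square 2D mesh, the Manhattan (XY-style) hop distance, the uniform spread of accesses over the L2 banks, the square regions, and the name avg_hops are all assumptions made for illustration. The model ignores contention, wire delay, and the D-MMUs entirely.

```python
from itertools import product


def avg_hops(mesh: int, region: int) -> float:
    """Mean Manhattan hop count from a PE to an L2 bank in its own region.

    Hypothetical toy model (assumptions, not taken from the paper):
      mesh   -- tiles per side of a square 2D mesh, one PE and one L2 bank per tile
      region -- tiles per side of a square region; region == mesh models a single
                L2 shared by the whole chip, with accesses spread evenly over banks
    """
    if region <= 0 or mesh % region != 0:
        raise ValueError("region side must be positive and divide the mesh side")
    total = 0
    count = 0
    for px, py in product(range(mesh), repeat=2):
        # Top-left corner of the region that contains this PE.
        rx = (px // region) * region
        ry = (py // region) * region
        # Every bank in that region is assumed equally likely to be accessed.
        for bx, by in product(range(rx, rx + region), range(ry, ry + region)):
            total += abs(px - bx) + abs(py - by)
            count += 1
    return total / count


if __name__ == "__main__":
    for side in (8, 4, 2):
        print(f"8x8 mesh, {side}x{side} regions: {avg_hops(8, side):.2f} hops on average")
```

Under these assumptions, an 8x8 mesh averages 5.25 hops with one chip-wide L2, 2.5 hops with 4x4 regions, and 1.0 hop with 2x2 regions. These figures come from the toy model only and are not results reported in the paper. Actual latency also depends on capacity per region, miss rates, and traffic that must leave the region.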