Orion: a power-performance simulator for interconnection networks
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Distance Associativity for High-Performance Energy-Efficient Non-Uniform Cache Architectures
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Victim Replication: Maximizing Capacity while Hiding Wire Delay in Tiled Chip Multiprocessors
Proceedings of the 32nd annual international symposium on Computer Architecture
A NUCA substrate for flexible CMP cache sharing
Proceedings of the 19th annual international conference on Supercomputing
Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Cooperative Caching for Chip Multiprocessors
Proceedings of the 33rd annual international symposium on Computer Architecture
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
ASR: Adaptive Selective Replication for CMP Caches
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Cooperative caching: using remote client memory to improve file system performance
OSDI '94 Proceedings of the 1st USENIX conference on Operating Systems Design and Implementation
Cooperative cache partitioning for chip multiprocessors
Proceedings of the 21st annual international conference on Supercomputing
An Adaptive Shared/Private NUCA Cache Partitioning Scheme for Chip Multiprocessors
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
Adaptive set pinning: managing shared caches in chip multiprocessors
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Larrabee: a many-core x86 architecture for visual computing
ACM SIGGRAPH 2008 papers
Power/Performance/Thermal Design-Space Exploration for Multicore Architectures
IEEE Transactions on Parallel and Distributed Systems
Distributed cooperative caching
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Proceedings of the 37th annual international symposium on Computer architecture
Hi-index | 0.00 |
Current trends in CMPs indicate that the core count will increase in the near future. One of the main performance limiters of these forthcoming microarchitectures is the latency and high-demand of the on-chip network and the off-chip memory communication. To optimize the usage of on-chip memory space and reduce off-chip traffic several techniques have proposed to use the N-chance forwarding mechanism, a solution for distributing unused cache space in chip multiprocessors. This technique, however, can lead in some cases to extra unnecessary network traffic or inefficient cache allocation. This paper presents two alternative power-efficient spilling methods to improve the efficiency of the N-chance forwarding mechanism. Compared to traditional Spilling, our Distance-Aware Spilling technique provides an energy efficiency improvement (MIPS3/W) of 16% on average, and a reduction of the network usage of 14% in a ring configuration while increasing performance 6%. Our Selective Spilling technique is able to avoid most of the unnecessary reallocations and it doubles the reuse of spilled blocks, reducing network traffic by an average of 22%. A combination of both techniques allows to reduce the network usage by 30% on average without degrading performance, allowing a 9% increase of the energy efficiency.