Light NUCA: a proposal for bridging the inter-cache latency gap

Authors:
Darío Suárez;Teresa Monreal;Fernando Vallejo;Ramón Beivide;Víctor Viñals
Affiliations:
Universidad de Zaragoza, Spain and HiPEAC Network of Excellence;Universidad de Zaragoza, Spain and HiPEAC Network of Excellence;Universidad de Cantabria, Spain and HiPEAC Network of Excellence;Universidad de Cantabria, Spain and HiPEAC Network of Excellence;Universidad de Zaragoza, Spain and HiPEAC Network of Excellence
Venue:
Proceedings of the Conference on Design, Automation and Test in Europe
Year:
2009



Abstract

To deal with the "memory wall" problem, microprocessors include large secondary on-chip caches. But as these caches grow, they create a new latency gap between themselves and the fast L1 caches (the inter-cache latency gap). Recently, Non-Uniform Cache Architectures (NUCAs) have been proposed to sustain the size growth of secondary caches, which is threatened by wire-delay problems. NUCAs are size-oriented, and they were not conceived to close the inter-cache latency gap. To tackle this problem, we propose Light NUCAs (L-NUCAs), which leverage on-chip wire density to interconnect small tiles through specialized networks that convey packets with distributed and dynamic routing. Our design reduces the tile delay (cache access plus one-hop routing) to a single processor cycle and places cache lines at a finer granularity than conventional caches, reducing cache latency. Our evaluations show that, in general, an L-NUCA simultaneously improves performance, energy, and area when integrated into either conventional or D-NUCA hierarchies.
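
As a rough illustration of why a single-cycle tile delay matters, the following toy model estimates hit latency as a function of hop distance. It is a sketch under stated assumptions and not the authors' evaluation model: the function names, the one-way hop counting, and the example hit distribution are all hypothetical. The only figure taken from the abstract is the one-cycle tile delay (cache access plus one-hop routing).

```python
from typing import Mapping


def tile_hit_latency(hops: int, tile_delay_cycles: int = 1) -> int:
    """Cycles to reach a tile `hops` hops away.

    Assumption: every hop costs one tile delay (cache access plus
    one-hop routing), and only the outbound path is counted.
    """
    if hops < 0:
        raise ValueError("hops must be non-negative")
    return hops * tile_delay_cycles


def average_hit_latency(
    hit_fraction_by_hops: Mapping[int, float],
    tile_delay_cycles: int = 1,
) -> float:
    """Weighted mean latency over a hypothetical hit distribution.

    `hit_fraction_by_hops` maps a hop distance to the fraction of hits
    served at that distance; the fractions must sum to 1.
    """
    total = sum(hit_fraction_by_hops.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("hit fractions must sum to 1")
    return sum(
        fraction * tile_hit_latency(hops, tile_delay_cycles)
        for hops, fraction in hit_fraction_by_hops.items()
    )


if __name__ == "__main__":
    # Illustrative distribution only; these fractions are not measured data.
    example = {1: 0.5, 2: 0.3, 3: 0.2}
    print(average_hit_latency(example))
```

In this simplified view, placing frequently used lines in tiles close to the L1 shifts the distribution toward small hop counts, which is the intuition behind finer-granularity placement lowering average cache latency.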