The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Route packets, not wires: on-chip inteconnection networks
Proceedings of the 38th annual Design Automation Conference
Computer Networks
SPEComp: A New Benchmark Suite for Measuring Parallel Computer Performance
WOMPAT '01 Proceedings of the International Workshop on OpenMP Applications and Tools: OpenMP Shared Memory Parallel Programming
Orion: a power-performance simulator for interconnection networks
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Interconnect-Aware Coherence Protocols for Chip Multiprocessors
Proceedings of the 33rd annual international symposium on Computer Architecture
Coherence Ordering for Ring-based Chip Multiprocessors
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Thermal sensor allocation and placement for reconfigurable systems
Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design
Proximity-aware directory-based coherence for multi-core processor architectures
Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures
Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Uncorq: Unconstrained Snoop Request Delivery in Embedded-Ring Multiprocessors
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Proximity coherence for chip multiprocessors
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
A workload-adaptive and reconfigurable bus architecture for multicore processors
International Journal of Reconfigurable Computing
Hi-index | 0.00 |
On chip many-core systems, evolving from prior multi-processor systems, are considered as a promising solution to the performance scalability and power consumption problems. The long communication distance between the traditional multi-processors makes directory-based cache coherence protocols better solutions compared to bus-based snooping protocols even with the overheads from indirections. However, much smaller distances between the CMP cores enhance the reachability of buses, revitalizing the applicability of snooping protocols for cache-to-cache transfers. In this work, we propose a hybrid NoC design to provide optimized support for cache coherency. In our design, on-chip links can be dynamically configured as either point-to-point links between NoC nodes or short buses to facilitate localized snooping. By taking advantage of the best of both worlds, bus-based snooping coherency and NoC-based directory coherency, our approach brings both power and performance benefits.