Interconnect characteristics of 2.5-D system integration scheme
Proceedings of the 2001 international symposium on Physical design
Lockup-free instruction fetch/prefetch cache organization
ISCA '81 Proceedings of the 8th annual symposium on Computer Architecture
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Die Stacking (3D) Microarchitecture
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Adaptive insertion policies for managing shared caches
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Hybrid cache architecture with disparate memory technologies
Proceedings of the 36th annual international symposium on Computer architecture
PIPP: promotion/insertion pseudo-partitioning of multi-core shared caches
Proceedings of the 36th annual international symposium on Computer architecture
Scaling the bandwidth wall: challenges in and avenues for CMP scaling
Proceedings of the 36th annual international symposium on Computer architecture
Extending the effectiveness of 3D-stacked DRAM caches with an adaptive multi-queue policy
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Thread Cluster Memory Scheduling: Exploiting Differences in Memory Access Behavior
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Efficiently enabling conventional block sizes for very large die-stacked DRAM caches
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Dynamically reconfigurable hybrid cache: an energy-efficient last-level cache design
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Reducing inter-core cache contention with an adaptive bank mapping policy in DRAM cache
Proceedings of the Ninth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis
Simultaneously optimizing DRAM cache hit latency and miss rate via novel set mapping policies
Proceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems
Hi-index | 0.00 |
On-chip DRAM caches may alleviate the memory bandwidth problem in future multi-core architectures through reducing off-chip accesses via increased cache capacity. For memory intensive applications, recent research has demonstrated the benefits of introducing high capacity on-chip L4-DRAM as Last-Level-Cache between L3-SRAM and off-chip memory. These multi-core cache hierarchies attempt to exploit the latency benefits of L3-SRAM and capacity benefits of L4-DRAM caches. However, not taking into consideration the cache access patterns of complex applications can cause inter-core DRAM interference and inter-core cache contention. In this paper, we contest to re-architect existing cache hierarchies by proposing a hybrid cache architecture, where the Last-Level-Cache is a combination of SRAM and DRAM caches. We propose an adaptive DRAM placement policy in response to the diverse requirements of complex applications with different cache access behaviors. It reduces inter-core DRAM interference and inter-core cache contention in SRAM/DRAM-based hybrid cache architectures: increasing the harmonic mean instruction-per-cycle throughput by 23.3% (max. 56%) and 13.3% (max. 35.1%) compared to state-of-the-art.