Memory bandwidth limitations of future microprocessors
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Reconfigurable caches and their application to media processing
Proceedings of the 27th annual international symposium on Computer architecture
Drowsy caches: simple techniques for reducing leakage power
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Exploring the Design Space of Future CMPs
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
Characterizing and Predicting Program Behavior and its Variability
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques
Reflections on the memory wall
Proceedings of the 1st conference on Computing frontiers
An empirical study of smoothing techniques for language modeling
ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
Dynamic tracking of page miss ratio curve for memory management
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Fair Cache Sharing and Partitioning in a Chip Multiprocessor Architecture
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
The PARSEC benchmark suite: characterization and architectural implications
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Predicting Program Behavior Based On Objective Function Minimization
IISWC '07 Proceedings of the 2007 IEEE 10th International Symposium on Workload Characterization
Hybrid cache architecture with disparate memory technologies
Proceedings of the 36th annual international symposium on Computer architecture
Scaling the bandwidth wall: challenges in and avenues for CMP scaling
Proceedings of the 36th annual international symposium on Computer architecture
Exploration of 3D stacked L2 cache design for high performance and efficient thermal control
Proceedings of the 14th ACM/IEEE international symposium on Low power electronics and design
Off-chip memory bandwidth minimization through cache partitioning for multi-core platforms
Proceedings of the 47th Design Automation Conference
IISWC '10 Proceedings of the IEEE International Symposium on Workload Characterization (IISWC'10)
Moguls: a model to explore the memory hierarchy for bandwidth improvements
Proceedings of the 38th annual international symposium on Computer architecture
MorphCache: A Reconfigurable Adaptive Multi-level Cache hierarchy
HPCA '11 Proceedings of the 2011 IEEE 17th International Symposium on High Performance Computer Architecture
Hi-index | 0.00 |
In chip-multiprocessor (CMP) designs, limited memory bandwidth is a potential bottleneck of the system performance. New memory technologies, such as spin-torque-transfer memory (STT-RAM), resistive memory (RRAM), and embedded DRAM (eDRAM), are promising on-chip memory solutions for CMPs. In this paper, we propose a bandwidth-aware re-configurable cache hierarchy (BARCH) with hybrid memory technologies. BARCH consists of a hybrid cache hierarchy, a reconfiguration mechanism, and a statistical prediction engine. Our hybrid cache hierarchy chooses different memory technologies to configure each level so that the bandwidth provided by the overall hierarchy is optimized. Furthermore, we present a reconfiguration mechanism to dynamically adapt the cache space of each level based on the predicted bandwidth demands of different applications, which is guaranteed by our prediction engine. We evaluate the system performance gain obtained by our method with a set of multithreaded and multiprogrammed applications. Compared to traditional SRAM-based cache designs, our proposed design improves the system throughput by 58% and 14% for multithreaded and multiprogrammed applications, respectively.