Planar-adaptive routing: low-cost adaptive networks for multiprocessors
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Tradeoffs in two-level on-chip caching
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
The Stanford FLASH multiprocessor
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Run-time adaptive cache hierarchy management via reference analysis
Proceedings of the 24th annual international symposium on Computer architecture
Dynamic history-length fitting: a third level of adaptivity for branch prediction
Proceedings of the 25th annual international symposium on Computer architecture
Exploiting spatial locality in data caches using spatial footprints
Proceedings of the 25th annual international symposium on Computer architecture
Adapting cache line size to application behavior
ICS '99 Proceedings of the 13th international conference on Supercomputing
Scalability of the cedar system
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Deadlock-Free Adaptive Routing in Multicomputer Networks Using Virtual Channels
IEEE Transactions on Parallel and Distributed Systems
MINT: A Front End for Efficient Simulation of Shared-Memory Multiprocessors
MASCOTS '94 Proceedings of the Second International Workshop on Modeling, Analysis, and Simulation On Computer and Telecommunication Systems
Software assistance for data caches
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Pursuing the Performance Potential of Dynamic Cache Line Sizes
ICCD '99 Proceedings of the 1999 IEEE International Conference on Computer Design
Distributed Shared Memory Architecture for JUMP-1: A General-Purpose MPP Prototype
ISPAN '96 Proceedings of the 1996 International Symposium on Parallel Architectures, Algorithms and Networks
ABSS v2.0: a SPARC Simulator
Fixed and Adaptive Sequential Prefetching in Shared Memory Multiprocessors
ICPP '93 Proceedings of the 1993 International Conference on Parallel Processing - Volume 01
An Integrated Hardware/Software Data Prefetching Scheme for Shared-Memory Multiprocessors
ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 02
Hi-index | 0.00 |
The performance of a traditional cache memory hierarchy can be improved by utilizing mechanisms such as a victim cache or a stream buffer (cache assists). The amount of on-chip memory for cache assist is typically limited for technological reasons. In addition, the cache assist size is limited in order to maintain a fast access time. Performance gains from using a stream buffer or a victim cache, or a combination of the two, varies from program to program as well as within a program. Therefore, given a limited amount of cache assist memory, there is a need and a potential for "adaptivity" of the cache assists i.e., an ability to vary their relative size within the bounds of the cache assist memory size. We propose and study a compiler-driven adaptive cache assist organization and its effect on system performance. Several adaptivity mechanisms are proposed and investigated. The results show that a cache assist that is adaptive at loop level clearly improves the cache memory performance, has low overhead, and can be easily implemented.