Data cache performance is critical to overall processor performance as the latency gap between the CPU core and main memory grows. Studies have shown that some loads can tolerate enough latency to be serviced from slower portions of memory, which allows more critical data to be kept in the higher levels of the cache hierarchy. We provide a strategy for identifying this latency-tolerant data at runtime and, using simple heuristics, keeping it out of the main cache and placing it instead in a small, parallel, associative buffer. Using such a "Non-Critical Buffer" dramatically improves the hit rate for more critical data and leads to a performance improvement comparable to or better than that of other traditional cache improvement schemes. IPC improvements of over 4% are seen for some benchmarks.
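The placement idea can be illustrated with a toy model. The following is a minimal sketch under stated assumptions, not the authors' design: the class name `NonCriticalBufferCache`, the direct-mapped main cache, the LRU replacement in the buffer, the sizes, and the per-load `critical` flag are all hypothetical. The abstract does not describe the runtime heuristics, so the flag simply stands in for whatever predictor marks a load as latency-tolerant.

```python
from collections import OrderedDict


class NonCriticalBufferCache:
    """Toy model: a direct-mapped main cache plus a small, fully
    associative LRU buffer that is probed alongside it (assumed design)."""

    def __init__(self, main_sets=4, ncb_entries=2):
        self.main_sets = main_sets
        self.main = {}            # set index -> resident block address
        self.ncb = OrderedDict()  # block address -> None, oldest first
        self.ncb_entries = ncb_entries

    def access(self, block, critical):
        """Return 'main', 'ncb', or 'miss' for one load; fill on a miss."""
        index = block % self.main_sets
        if self.main.get(index) == block:
            return "main"
        if block in self.ncb:
            self.ncb.move_to_end(block)  # refresh LRU position
            return "ncb"
        if critical:
            # Critical data is filled into the main cache.
            self.main[index] = block
        else:
            # Latency-tolerant data is kept out of the main cache.
            self.ncb[block] = None
            if len(self.ncb) > self.ncb_entries:
                self.ncb.popitem(last=False)  # evict least recently used
        return "miss"


def run(trace, use_ncb):
    """Replay (block, critical) pairs. With use_ncb=False every load is
    treated as critical, so all fills go to the main cache."""
    cache = NonCriticalBufferCache()
    return [cache.access(block, critical or not use_ncb)
            for block, critical in trace]


# Blocks 0 and 4 map to the same main-cache set; block 4 is assumed
# to be latency-tolerant.
trace = [(0, True), (4, False), (0, True), (4, False)]
```

In this trace, replaying with `use_ncb=False` lets the two blocks evict each other from the shared set on every access, while `use_ncb=True` keeps block 0 resident in the main cache and serves block 4 from the buffer. The sketch only shows the conflict-avoidance effect; it does not model latencies, and so says nothing about the IPC figures reported above.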