Improving the Effectiveness of Context-Based Prefetching with Multi-order Analysis
ICPPW '10 Proceedings of the 2010 39th International Conference on Parallel Processing Workshops
Data prefetching is widely used in high-end computing systems to accelerate data accesses and to bridge the growing performance gap between processor and memory. Context-based prefetching has become a primary focus of study in recent years because of its general applicability. However, current context-based prefetchers analyze context at only a single order, which yields low prefetching coverage and limits overall prefetching effectiveness. Existing approaches also usually consider the context of the address stream from a single instruction (local context) rather than the context of the address stream from all instructions (global context), which further limits the effectiveness of context-based prefetching. In this study, we propose a new context-based prefetcher called the Global-aware and Multi-order Context-based (GMC) prefetcher. The GMC prefetcher uses multi-order, local and global context analysis to increase prefetching coverage while maintaining prefetching accuracy. In extensive simulation of the SPEC CPU2006 benchmarks with an enhanced CMP$im simulator, the GMC prefetcher outperformed existing prefetchers and effectively reduced data-access latency. With GMC prefetching, the average Instructions Per Cycle (IPC) improvement was over 55% for the SPEC CINT2006 benchmarks and over 44% for the CFP2006 benchmarks.
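To make the idea of multi-order, local and global context analysis concrete, the following is a minimal Python sketch. It is not the GMC design: the abstract does not specify table organization, replacement, or how local and global predictions are arbitrated, so every name here (`MultiOrderContextPredictor`, `ToyLocalGlobalPrefetcher`, `max_order`) and every policy (try the longest matching context first, prefer the per-instruction prediction over the global one) is an assumption chosen only for illustration.

```python
from collections import deque


class MultiOrderContextPredictor:
    """Toy predictor: maps the last k addresses (k = 1..max_order) to the next one."""

    def __init__(self, max_order=3):
        self.max_order = max_order
        self.history = deque(maxlen=max_order)
        # tables[k] maps a tuple of k consecutive addresses to the address
        # that most recently followed that tuple.
        self.tables = {k: {} for k in range(1, max_order + 1)}

    def update(self, addr):
        """Record that `addr` followed each suffix of the current history."""
        hist = tuple(self.history)
        for k in range(1, min(self.max_order, len(hist)) + 1):
            self.tables[k][hist[-k:]] = addr
        self.history.append(addr)

    def predict(self):
        """Return a predicted next address, trying the longest context first."""
        hist = tuple(self.history)
        for k in range(min(self.max_order, len(hist)), 0, -1):
            nxt = self.tables[k].get(hist[-k:])
            if nxt is not None:
                return nxt  # a shorter order is used only if longer ones miss
        return None


class ToyLocalGlobalPrefetcher:
    """Assumed arbitration: use the per-PC (local) stream, else the global stream."""

    def __init__(self, max_order=3):
        self.max_order = max_order
        self.local = {}  # one predictor per instruction address (PC)
        self.global_stream = MultiOrderContextPredictor(max_order)

    def access(self, pc, addr):
        """Observe one access and return a prefetch candidate, or None."""
        local = self.local.setdefault(
            pc, MultiOrderContextPredictor(self.max_order)
        )
        local.update(addr)
        self.global_stream.update(addr)
        candidate = local.predict()
        if candidate is None:
            candidate = self.global_stream.predict()
        return candidate
```

In this sketch, a single instruction that touches blocks 10, 20, 30 and then 10 again gets a candidate of 20 on the fourth access: the order-3 and order-2 contexts have not been seen before, but the order-1 context has, which is how lower orders can add coverage. Likewise, an instruction with no history of its own can still receive a candidate from the global stream.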