Contrasting characteristics and cache performance of technical and multi-user commercial workloads
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Wrong-path instruction prefetching
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
On high-bandwidth data cache design for multi-issue processors
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
A Performance Study of Instruction Cache Prefetching Methods
IEEE Transactions on Computers
CPU Cache Prefetching: Timing Evaluation of Hardware Implementations
IEEE Transactions on Computers
Computer architecture (2nd ed.): a quantitative approach
Computer architecture (2nd ed.): a quantitative approach
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Improving prediction for procedure returns with return-address-stack repair mechanisms
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Prefetching Using Markov Predictors
IEEE Transactions on Computers - Special issue on cache memory and related problems
Fetch directed instruction prefetching
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Circuits for wide-window superscalar processors
Proceedings of the 27th annual international symposium on Computer architecture
ACM Computing Surveys (CSUR)
DBMSs on a Modern Processor: Where Does Time Go?
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Branch History Guided Instruction Prefetching
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
A low-cost memory architecture with NAND XIP for mobile embedded systems
Proceedings of the 1st IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Exploring the limits of prefetching
IBM Journal of Research and Development - Electrochemical technology in microelectronics
An effective instruction cache prefetch policy by exploiting cache history information
EUC'05 Proceedings of the 2005 international conference on Embedded and Ubiquitous Computing
Cost Minimization with HPDFG and Data Mining for Heterogeneous DSP
Journal of Signal Processing Systems
Reducing Power and Energy Overhead in Instruction Prefetching for Embedded Processor Systems
International Journal of Handheld Computing Research
Hi-index | 0.00 |
The increasing gap in performance between processors and main memory has made effective instructions prefetching techniques more important than ever. A major deficiency of existing prefetching methods is that most of them require an extra port to I-cache. A recent study by [19] shows that this factor alone explains why most modern microprocessors do not use such hardware-based I-cache prefetch schemes. The contribution of this paper is two-fold. First we present a method that does not require an extra port to I-cache. Second, the performance improvement for our method is greater than the best competing method [23] even disregarding the improvement from not having an extra port.The three key features of our method that prevent the above deficiencies are as follows. First, too-late prefetching is prevented by correlating misses to dynamically preceding instructions. For example, if the I-cache miss latency is 12 cycles, then the instruction that was fetched 12 cycles prior to the miss is used as the prefetch trigger. Second, the miss history table is kept to a reasonable size by grouping contiguous cache misses together and associated them with one preceding instruction, and therefore, one table entry. Third, the extra I-cache port is avoided through efficient prefetch filtering methods. Experiments show that for our benchmarks, chosen for their poor I-cache performance, an average improvement of 9.2% in runtime is achieved versus the BHGP methods [23], while the hardware cost is also reduced. The improvement will be greater if the runtime impact of avoiding an extra port is considered.