Most current single-chip processors employ an on-chip instruction cache to improve performance. A miss in this instruction cache causes an external memory reference, which must compete with data references for access to the external memory and therefore degrades the overall performance of the processor. One common way to reduce the number of off-chip instruction requests is to increase the size of the on-chip cache. This paper presents an alternative approach, in which a combination of an instruction cache, an instruction queue, and an instruction queue buffer achieves the same effect with a much smaller instruction cache. Such an approach is significant for emerging technologies, where high circuit densities are initially difficult to achieve yet a high level of performance is desired, or for more mature technologies, where chip area can instead be used to provide more functionality. The viability of this approach is demonstrated by its implementation in an existing single-chip processor.
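The baseline that the abstract argues against, reducing off-chip instruction requests by enlarging the on-chip cache, can be illustrated with a minimal sketch. It assumes a direct-mapped cache and a synthetic loop trace; the function name, parameters, and trace are hypothetical and are not taken from the paper. The instruction queue and instruction queue buffer are not modeled, because the abstract does not describe how they are organized.

```python
def count_offchip_fetches(trace, num_lines, line_words=4):
    """Count misses of a direct-mapped instruction cache over an address trace.

    trace      -- iterable of instruction word addresses (hypothetical input)
    num_lines  -- number of cache lines (hypothetical parameter)
    line_words -- instruction words per cache line (hypothetical parameter)
    """
    tags = [None] * num_lines          # one stored block number per line; None = invalid
    misses = 0
    for addr in trace:
        block = addr // line_words     # memory block that holds this instruction
        index = block % num_lines      # direct-mapped: exactly one candidate line
        if tags[index] != block:       # miss: the block must be fetched off-chip
            tags[index] = block
            misses += 1
    return misses


# Synthetic trace (an assumption, not measured data):
# a 64-instruction loop body executed 10 times.
loop_trace = [pc for _ in range(10) for pc in range(64)]

small = count_offchip_fetches(loop_trace, num_lines=8)    # 32-word cache: loop does not fit
large = count_offchip_fetches(loop_trace, num_lines=16)   # 64-word cache: loop fits
print(small, large)
```

In this toy trace the smaller cache misses on every line in every iteration, while the larger cache misses only during the first pass through the loop. Each of those misses is an off-chip request that would compete with data references, which is the cost that the approach in the abstract aims to avoid without growing the cache.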