Improving instruction cache behavior by reducing cache pollution
Proceedings of the 1990 ACM/IEEE conference on Supercomputing
Effectiveness of trace sampling for performance debugging tools
SIGMETRICS '93 Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Threaded prefetching: an adaptive instruction prefetch mechanism
Microprocessing and Microprogramming
Evaluating stream buffers as a secondary cache replacement
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Trace cache: a low latency approach to high bandwidth instruction fetching
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Architectural exploration and optimization of local memory in embedded systems
ISSS '97 Proceedings of the 10th international symposium on System synthesis
Prefetching using Markov predictors
Proceedings of the 24th annual international symposium on Computer architecture
A Performance Study of Instruction Cache Prefetching Methods
IEEE Transactions on Computers
25 years of the international symposia on Computer architecture (selected papers)
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Application-specific memory management for embedded systems using software-controlled caches
Proceedings of the 37th Annual Design Automation Conference
Performance analysis using the MIPS R10000 performance counters
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
ARM System-on-Chip Architecture
ARM System-on-Chip Architecture
An optimal memory allocation scheme for scratch-pad-based embedded systems
ACM Transactions on Embedded Computing Systems (TECS)
Software-assisted cache replacement mechanisms for embedded systems
Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design
Scratchpad memory: design alternative for cache on-chip memory in embedded systems
Proceedings of the tenth international symposium on Hardware/software codesign
Instruction prefetching using branch prediction information
ICCD '97 Proceedings of the 1997 International Conference on Computer Design (ICCD '97)
Compiler-decided dynamic memory allocation for scratch-pad based embedded systems
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Cluster miss prediction for instruction caches in embedded networking applications
Proceedings of the 14th ACM Great Lakes symposium on VLSI
Support for software performance tuning on network processors
IEEE Network: The Magazine of Global Internetworking
An effective instruction cache prefetch policy by exploiting cache history information
EUC'05 Proceedings of the 2005 international conference on Embedded and Ubiquitous Computing
Hi-index | 0.00 |
Soft CPU cores are often used in embedded systems, yet they limit opportunities to improve cache performance to hardware assistance outside the CPU core. Instruction prefetching is commonly used, but the popular Prefetch On Miss (POM) technique is less helpful when the instruction flow does not follow a sequential execution order, which is often the case in real-time networking applications. Cluster Miss Prediction (CMP) can help in those worst case situations when cache misses do not follow a sequential order, and can be combined with POM to provide an effective technique for real-time networking applications on embedded systems. The benefits of the CMP+POM technique are illustrated in the context of an industrial embedded networking application using different cache configurations.