An effective on-chip preloading scheme to reduce data access penalty
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Evaluating stream buffers as a secondary cache replacement
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
A performance study of software and hardware data prefetching schemes
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Hardware-only stream prefetching and dynamic access ordering
Proceedings of the 14th international conference on Supercomputing
APEX: access pattern based memory architecture exploration
Proceedings of the 14th international symposium on Systems synthesis
An efficient profile-analysis framework for data-layout optimizations
POPL '02 Proceedings of the 29th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Custom Memory Management Methodology: Exploration of Memory Organisation for Embedded Multimedia System Design
Effective Hardware-Based Data Prefetching for High-Performance Processors
IEEE Transactions on Computers
Multiprocessor system-on-chip data reuse analysis for exploring customized memory hierarchies
Proceedings of the 43rd annual Design Automation Conference
External memory page remapping for embedded multimedia systems
Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Memory organization with multi-pattern parallel accesses
Proceedings of the conference on Design, automation and test in Europe
Holistic design and caching in mobile computing
CODES+ISSS '08 Proceedings of the 6th IEEE/ACM/IFIP international conference on Hardware/Software codesign and system synthesis
Automatic memory partitioning: increasing memory parallelism via data structure partitioning
CODES/ISSS '10 Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Using memory profile analysis for automatic synthesis of pointers code
ACM Transactions on Embedded Computing Systems (TECS)
Hi-index | 0.00 |
Memory system is a major performance and power bottleneck in embedded systems especially for multimedia applications. Most multimedia applications access stream type of data structures with regular access patterns. It is observed that conventional caches behave poorly for stream-type data structure. Therefore, prediction-based prefetching techniques have been extensively researched to exploit the regular access patterns. Prefetching, however, may pollute the cache if the prediction is not accurate and needs extra hardware prediction logic. To overcome these problems, we propose a novel hardware prefetching technique that is assisted by static analysis of data access pattern with stream caches. With the proposed stream cache architecture, we could achieve significant performance improvement compared with the conventional cache architecture.