An effective on-chip preloading scheme to reduce data access penalty
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Compiler-based prefetching for recursive data structures
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Exploiting hardware performance counters with flow and context sensitive profiling
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Analytical energy dissipation models for low-power caches
ISLPED '97 Proceedings of the 1997 international symposium on Low power electronics and design
The predictability of data values
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
The SimpleScalar tool set, version 2.0
ACM SIGARCH Computer Architecture News
A 160-MHz, 32-b, 0.5-W CMOS RISC microprocessor
Digital Technical Journal
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Reconfigurable caches and their application to media processing
Proceedings of the 27th annual international symposium on Computer architecture
Energy-Efficiency of VLSI Caches: A Comparative Study
VLSID '97 Proceedings of the Tenth International Conference on VLSI Design: VLSI in Multimedia Applications
Energy Benefits of a Configurable Line Size Cache for Embedded Systems
ISVLSI '03 Proceedings of the IEEE Computer Society Annual Symposium on VLSI (ISVLSI'03)
A highly configurable cache architecture for embedded systems
Proceedings of the 30th annual international symposium on Computer architecture
The Minimax Cache: An Energy-Efficient Framework for Media Processors
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
An energy efficient cache memory architecture for embedded systems
Proceedings of the 2004 ACM symposium on Applied computing
Using a Victim Buffer in an Application-Specific Memory Hierarchy
Proceedings of the conference on Design, automation and test in Europe - Volume 1
Automatic Tuning of Two-Level Caches to Embedded Applications
Proceedings of the conference on Design, automation and test in Europe - Volume 1
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
Configurable range memory for effective data reuse on programmable accelerators
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Hi-index | 0.00 |
This paper shows that even very small reconfigurable data caches, when split to serve data streams exhibiting temporal and spatial localities, can improve performance of embedded applications without consuming excessive silicon real estate or power. It also shows that neither higher set-associativities nor large block sizes are necessary with reconfigurable split cache organizations. We use benchmark programs from the MiBench suite to show that our cache organization outperforms an 8k unified data cache in terms of miss rates, access times, energy consumption and silicon area. Finally we show how the saved area can be utilized for supporting techniques for improving performance of embedded systems. Our design enables the cache to be divided into multiple partitions that can be used for different processor activities other than conventional caching. In this paper we have evaluated one of those options to support "prefetching".