Reducing TLB power requirements
ISLPED '97 Proceedings of the 1997 international symposium on Low power electronics and design
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
TLB and snoop energy-reduction using virtual caches in low-power chip-multiprocessors
Proceedings of the 2002 international symposium on Low power electronics and design
Towards Virtually-Addressed Memory Hierarchies
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
A Banked-Promotion TLB for High Performance and Low Power
ICCD '01 Proceedings of the International Conference on Computer Design: VLSI in Computers & Processors
Compiler-Directed Code Restructuring for Reducing Data TLB Energy
CODES+ISSS '04 Proceedings of the international conference on Hardware/Software Codesign and System Synthesis: 2004
Hi-index | 0.00 |
In this paper we present an application-driven address translation scheme for low-power and real-time embedded processors with virtual memory support. The power inefficiency and nondeterministic execution times of address-translation mechanisms have been major barriers in adopting and utilizing the benefits of virtual memory in embedded processors with low-power and real-time constraints. To address this problem, we propose a novel, Customizable Translation Table (CTT) organization, where application knowledge regarding the virtual memory footprint is used in order to eliminate conflicts in the hardware translation buffer and, thus, achieve tag-free address translation lookups. The set of virtual pages is partitioned into groups, such that for each group only a few of the least significant bits are used as an index to obtain the physical page number. We outline an efficient compile-time algorithm for identifying these groups and allocate their translation entries optimally into the CTT. The proposed methodology relies on the combined efforts of compiler, operating system, and hardware architecture to achieve a significant power reduction. The experiments that we have performed on a set of embedded applications show power reductions in the range of 55% to 80% compared to a general- purpose Translation Lookaside Buffer (TLB).