In this paper, we present an arithmetic-based address translation scheme for low-power and real-time embedded processors with virtual memory support. General-purpose virtual memory support has two fundamental disadvantages: excessive power consumption and nondeterministic execution times. These disadvantages have been the main reason why virtual memory and its associated benefits have not been adopted in embedded systems, where energy efficiency and real-time operation are major requirements. To address these issues, we propose a novel scheme for application-driven address translation in which most virtual address translations, traditionally performed as lookups in the Translation Lookaside Buffer (TLB), are replaced with fast and energy-efficient arithmetic add operations. To achieve this, program and system-wide information is used to identify sequences of consecutive virtual page numbers that are mapped to sequences of consecutive physical page frames. For such pairs of virtual and physical page sequences, only the addition of a constant to the virtual page number is needed to produce the physical page frame. The proposed methodology relies on the combined efforts of the compiler, the operating system, and the hardware architecture to achieve a significant power reduction. Because the approach fundamentally eliminates the conflicts inherent in the hardware translation table, execution time is not only improved but also made predictable at system design time. Our experiments on a set of embedded applications show power reductions in the range of 80% to 95% compared to a general-purpose TLB.
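The core idea, replacing a table lookup with the addition of a constant, can be illustrated with a small software sketch. It is not the mechanism from the paper, which involves compiler, operating system, and hardware support that the abstract does not detail. The 4 KiB page size and the names `Range`, `build_ranges`, and `translate` are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

PAGE_SHIFT = 12  # assumption: 4 KiB pages


@dataclass(frozen=True)
class Range:
    """A run of consecutive virtual pages mapped to consecutive frames."""
    vpn_start: int  # first virtual page number in the run
    length: int     # number of pages in the run
    delta: int      # constant such that pfn = vpn + delta


def build_ranges(page_table: Dict[int, int]) -> List[Range]:
    """Group a vpn -> pfn mapping into runs that share one additive constant."""
    ranges: List[Range] = []
    run_start: Optional[int] = None
    prev_vpn = 0
    delta = 0
    for vpn in sorted(page_table):
        pfn = page_table[vpn]
        # Extend the current run if both vpn and pfn advance by exactly one.
        if run_start is not None and vpn == prev_vpn + 1 and pfn - vpn == delta:
            prev_vpn = vpn
            continue
        if run_start is not None:
            ranges.append(Range(run_start, prev_vpn - run_start + 1, delta))
        run_start, prev_vpn, delta = vpn, vpn, pfn - vpn
    if run_start is not None:
        ranges.append(Range(run_start, prev_vpn - run_start + 1, delta))
    return ranges


def translate(vaddr: int, ranges: List[Range]) -> Optional[int]:
    """Translate with a range check plus one add; None means no range matched."""
    vpn = vaddr >> PAGE_SHIFT
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    for r in ranges:
        if r.vpn_start <= vpn < r.vpn_start + r.length:
            return ((vpn + r.delta) << PAGE_SHIFT) | offset
    return None  # a real system would fall back to a TLB or page-table walk


# Hypothetical mapping: two runs of consecutive pages.
page_table = {0x10: 0x80, 0x11: 0x81, 0x12: 0x82, 0x20: 0x40, 0x21: 0x41}
ranges = build_ranges(page_table)
```

In this sketch five page-table entries collapse into two ranges, so any address in those pages is translated by one comparison and one addition instead of an associative lookup. The linear scan over ranges stands in for whatever range-matching logic a hardware implementation would use.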