Highly accurate and efficient evaluation of randomising set index functions
Journal of Systems Architecture: the EUROMICRO Journal
Tolerating Late Memory Traps in Dynamically Scheduled Processors
IEEE Transactions on Computers
Moving Address Translation Closer to Memory in Distributed Shared-Memory Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
SBCCI '05 Proceedings of the 18th annual symposium on Integrated circuits and system design
CODES+ISSS '05 Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
GLSVLSI '06 Proceedings of the 16th ACM Great Lakes symposium on VLSI
A low-cost memory remapping scheme for address bus protection
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Heterogeneously tagged caches for low-power embedded systems with virtual memory support
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Direct address translation for virtual memory in energy-efficient embedded systems
ACM Transactions on Embedded Computing Systems (TECS)
A low-cost memory remapping scheme for address bus protection
Journal of Parallel and Distributed Computing
Enigma: architectural and operating system support for reducing the impact of address translation
Proceedings of the 24th ACM International Conference on Supercomputing
Hi-index | 0.01 |
Abstract: Currently cache hierarchies are indexed in parallel with a TLB but their tags are part of the physical address so that the memory hierarchy is physically addressed. This design faces problems as more concurrency is exploited in the processor core and as the memory demand of emerging applications is growing fast. The traditional TLB does not scale well inside the processor core and its hit rate can be poor for data-intensive applications or scientific applications without much locality. At the same time, given current trends towards computing in memory and in communication interfaces, virtual addresses are needed not just inside the processor but throughout the memory hierarchy. These observations have prompted us to revisit the problem of moving virtual address translation away from the processor.This paper introduces new ideas to enable the use of virtual addresses throughout the memory hierarchy. The major idea is the replacement of the TLB with a small Synonym Lookaside Buffer (SLB), which scales well because its size depends on the number of synonyms, and not on the size of the application or of the physical memory. We also characterize synonym usage, evaluate the amount of cache and SLB flushing due to remapping of addresses, and compare the miss rate of various virtual/physical cache organizations for several application domains. These evaluations show that virtually-addressed memory hierarchies overall have better performance behavior than physically-addressed memory hierarchies. Finally, we also show how virtually-addressed memory hierarchies facilitate natural, scalable multiprocessor extensions, as well as computing-in-memory in the context of general-purpose computers.