Options for dynamic address translation in COMAs

Authors:
Xiaogang Qiu;Michel Dubois
Affiliations:
Department of Electrical Engineering - Systems, University of Southern California, Los Angeles, CA;Department of Electrical Engineering - Systems, University of Southern California, Los Angeles, CA
Venue:
Proceedings of the 25th annual international symposium on Computer architecture
Year:
1998

Citing 28
Cited 9

An in-cache address translation mechanism

ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Coherency for multiprocessor virtual address caches

ASPLOS II Proceedings of the second international conference on Architectual support for programming languages and operating systems
Organization and performance of a two-level virtual-real cache hierarchy

ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Performance evaluation of memory consistency models for shared-memory multiprocessors

ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
A simulation based study of TLB performance

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Tradeoffs in supporting two page sizes

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
DDM: A Cache-Only Memory Architecture

Computer
Architecture support for single address space operating systems

ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Design tradeoffs for software-managed TLBs

ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Architectural support for translation table management in large address space machines

ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
The PowerPC architecture: a specification for a new family of RISC processors

The PowerPC architecture: a specification for a new family of RISC processors
The Stanford FLASH multiprocessor

ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Surpassing the TLB performance of superpages with less operating system support

ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Sharing and protection in a single-address-space operating system

ACM Transactions on Computer Systems (TOCS) - Special issue on computer architecture
Performance of the VAX-11/780 translation buffer: simulation and measurement

ACM Transactions on Computer Systems (TOCS)
COMA-F: a non-hierarchical cache only memory architecture

COMA-F: a non-hierarchical cache only memory architecture
The SPLASH-2 programs: characterization and methodological considerations

ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Reducing TLB and memory overhead using online superpage promotion

ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
CAT—caching address tags: a technique for reducing area cost of on-chip caches

ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
High-bandwidth address translation for multiple-issue processors

ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Compiler-directed page coloring for multiprocessors

Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Analysis of Cache Performance for Operating Systems and Multiprogramming

Analysis of Cache Performance for Operating Systems and Multiprogramming
The MIPS R10000 Superscalar Microprocessor

IEEE Micro
UltraSparc I: A Four-Issue Processor Supporting Multimedia

IEEE Micro
Virtual-Address Caches Part 1: Problems and Solutions in Uniprocessors

IEEE Micro
Virtual-Address Caches, Part 2: Multiprocessor Issues

IEEE Micro
Hardware Versus Software Implementation of COMA

ICPP '97 Proceedings of the international Conference on Parallel Processing
Software-Managed Address Translation

HPCA '97 Proceedings of the 3rd IEEE Symposium on High-Performance Computer Architecture

Tolerating late memory traps in ILP processors

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Recency-based TLB preloading

Proceedings of the 27th annual international symposium on Computer architecture
Virtual memory on data diffusion architectures

Parallel Computing
Tolerating Late Memory Traps in Dynamically Scheduled Processors

IEEE Transactions on Computers
Moving Address Translation Closer to Memory in Distributed Shared-Memory Multiprocessors

IEEE Transactions on Parallel and Distributed Systems
Inter-core cooperative TLB for chip multiprocessors

Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Enigma: architectural and operating system support for reducing the impact of address translation

Proceedings of the 24th ACM International Conference on Supercomputing
Synergistic TLBs for High Performance Address Translation in Chip Multiprocessors

MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
TLB Improvements for Chip Multiprocessors: Inter-Core Cooperative Prefetchers and Shared Last-Level TLBs

ACM Transactions on Architecture and Code Optimization (TACO)

Quantified Score

Hi-index	0.00

Visualization

Abstract

In modern processors, the dynamic translation of virtual addresses to support virtual memory is done before or in parallel with the first-level cache access. As processor technology improves at a rapid pace and the working sets of new applications grow insatiably the latency and bandwidth demands on the TLB (Translation Lookaside Buffer) are getting more and more difficult to meet. The situation is worse in multiprocessor systems, which run larger applications and are plagued by the TLB consistency problem.We evaluate and compare five options for virtual address translation in the context of COMAs (Cache Only Memory Architectures). The dynamic address translation mechanism can be located after the cache access provided the cache is virtual. In a particular design, which we call V-COMA for Virtual COMA, the physical address concept and the traditional TLB are eliminated. While still supporting virtual memory, V-COMA reduces the address translation overhead to a minimum.V-COMA scales well and works better in systems with large number of processors. As a machine running on virtual addresses, V-COMA provides a simple and consistent hardware model to the operating system and the compiler, in which further optimization opportunities are possible.