UTLB: a mechanism for address translation on network interfaces

Authors:
Yuqun Chen; Angelos Bilas; Stefanos N. Damianakis; Cezary Dubnicki; Kai Li
Affiliations:
Department of Computer Science, Princeton University, Princeton, NJ (all five authors)
Venue:
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Year:
1998


Abstract

An important aspect of a high-speed network system is the ability to transfer data directly between the network interface and application buffers. Such a direct data path requires the network interface to "know" the virtual-to-physical address translation of a user buffer, i.e., the physical memory location of the buffer. This paper presents an efficient address translation architecture, User-managed TLB (UTLB), which eliminates system calls and device interrupts from the common communication path. UTLB also supports application-specific policies to pin and unpin application memory. We report micro-benchmark results for an implementation on Myrinet PC clusters. We use trace-driven analysis to compare the UTLB approach with the interrupt-based approach and to study the effects of UTLB cache size, associativity, and prefetching. Our results show that the UTLB approach delivers robust performance with relatively small translation cache sizes.
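
The abstract names cache size and associativity as parameters of the trace-driven study. The following is a minimal sketch of how such a study could be set up, not the paper's simulator: it replays a trace of virtual addresses through a set-associative translation cache with LRU replacement and reports the miss rate. Everything in it is an assumption made for illustration: the names `TranslationCache` and `replay`, the 4 KB page size, the LRU policy, and the plain dictionary that stands in for whatever translation table the network interface would consult on a miss.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes (not stated in the abstract)


class TranslationCache:
    """Hypothetical set-associative cache of virtual-page -> physical-frame
    entries with LRU replacement inside each set."""

    def __init__(self, num_entries, ways):
        if num_entries % ways != 0:
            raise ValueError("num_entries must be a multiple of ways")
        self.ways = ways
        self.num_sets = num_entries // ways
        # One OrderedDict per set; insertion order doubles as LRU order.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr, page_table):
        """Return the physical address for vaddr, filling the cache on a miss."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        entries = self.sets[vpn % self.num_sets]
        if vpn in entries:
            self.hits += 1
            entries.move_to_end(vpn)  # mark as most recently used
        else:
            self.misses += 1
            if len(entries) >= self.ways:
                entries.popitem(last=False)  # evict least recently used
            # Assumption: a dict lookup stands in for the miss handler.
            entries[vpn] = page_table[vpn]
        return entries[vpn] * PAGE_SIZE + offset

    def miss_rate(self):
        total = self.hits + self.misses
        return self.misses / total if total else 0.0


def replay(trace, page_table, num_entries, ways):
    """Replay a trace of virtual addresses and return the miss rate."""
    cache = TranslationCache(num_entries, ways)
    for vaddr in trace:
        cache.translate(vaddr, page_table)
    return cache.miss_rate()


# Synthetic trace: four pages touched cyclically, four times over.
page_table = {vpn: 100 + vpn for vpn in range(8)}
trace = [vpn * PAGE_SIZE for vpn in [0, 1, 2, 3] * 4]
```

With this synthetic trace, a four-entry cache takes only the four compulsory misses, while a two-entry cache misses on every access because the cyclic working set of four pages never fits. A real study would replace the synthetic trace with recorded buffer addresses from applications.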