Cheating the I/O bottleneck: network storage with Trapeze/Myrinet

Authors:
Darrell C. Anderson;Jeffrey S. Chase;Syam Gadde;Andrew J. Gallatin;Kenneth G. Yocum;Michael J. Feeley
Affiliations:
Department of Computer Science, Duke University;Department of Computer Science, Duke University;Department of Computer Science, Duke University;Department of Computer Science, Duke University;Department of Computer Science, Duke University;Department of Computer Science, University of British Columbia
Venue:
ATEC '98 Proceedings of the annual conference on USENIX Annual Technical Conference
Year:
1998

Abstract

Recent advances in I/O bus structures (e.g., PCI), high-speed networks, and fast, cheap disks have significantly expanded the I/O capacity of desktop-class systems. This paper describes a messaging system designed to deliver the potential of these advances to network storage systems, including cluster file systems and network memory. We describe gms_net, an RPC-like kernel-to-kernel messaging system based on Trapeze, a new firmware program for Myrinet network interfaces. We show how the communication features of Trapeze and gms_net are used by the Global Memory Service (GMS), a kernel-based network memory system. The paper focuses on support for zero-copy page migration in GMS/Trapeze using two RPC variants important for peer-to-peer distributed services: (1) delegated RPC, in which a request is delegated to a third party, and (2) nonblocking RPC, in which replies are processed from the Trapeze receive interrupt handler. We present measurements of sequential file access from network memory in the GMS/Trapeze prototype on a Myrinet/Alpha cluster, showing the bandwidth effects of file system interfaces and communication choices. GMS/Trapeze delivers a peak read bandwidth of 96 MB/s using memory-mapped file I/O.
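
The two RPC variants named in the abstract can be illustrated with a toy, single-process simulation. This is only a sketch under stated assumptions: the `Network` and `Node` classes, the `getpage` call, and the message fields are hypothetical names invented for illustration and are not the gms_net or Trapeze interfaces. The sketch models message flow only; it does not model zero-copy transfer, interrupts, or real network hardware. It shows a request that is delegated by a directory node to the node holding the page, with the reply going straight back to the original caller, and a caller that registers a continuation instead of blocking while the reply is outstanding.

```python
from collections import deque


class Network:
    """Toy in-process message fabric (hypothetical stand-in for the interconnect)."""

    def __init__(self):
        self.nodes = {}
        self.queue = deque()

    def send(self, dst, msg):
        self.queue.append((dst, msg))

    def run(self):
        # Deliver queued messages until idle, invoking each node's receive handler.
        while self.queue:
            dst, msg = self.queue.popleft()
            self.nodes[dst].on_receive(msg)


class Node:
    def __init__(self, name, net):
        self.name, self.net = name, net
        self.pages = {}      # page id -> contents held by this node
        self.directory = {}  # page id -> name of the node holding that page
        self.pending = {}    # request id -> continuation awaiting a reply
        self.next_id = 0
        net.nodes[name] = self

    def getpage(self, page_id, directory_node, on_reply):
        """Nonblocking call: record a continuation and return immediately."""
        rid = (self.name, self.next_id)
        self.next_id += 1
        self.pending[rid] = on_reply
        self.net.send(directory_node, {"op": "lookup", "rid": rid,
                                       "page": page_id, "reply_to": self.name})
        return rid

    def on_receive(self, msg):
        if msg["op"] == "lookup":
            # Delegation: forward the request to the holder and keep the
            # original reply_to, so the reply bypasses this node.
            holder = self.directory[msg["page"]]
            self.net.send(holder, dict(msg, op="fetch"))
        elif msg["op"] == "fetch":
            self.net.send(msg["reply_to"], {"op": "reply", "rid": msg["rid"],
                                            "data": self.pages[msg["page"]],
                                            "from": self.name})
        elif msg["op"] == "reply":
            # The continuation runs inside the receive handler; no caller
            # thread was blocked waiting for this reply.
            self.pending.pop(msg["rid"])(msg["data"], msg["from"])


def demo():
    net = Network()
    client, directory, holder = (Node(n, net) for n in ("client", "directory", "holder"))
    holder.pages[7] = b"page-7-contents"
    directory.directory[7] = "holder"
    results = []
    client.getpage(7, "directory", lambda data, src: results.append((data, src)))
    assert results == []  # the call returned before any reply arrived
    net.run()
    return results
```

Calling `demo()` sends the request to the directory node, which forwards it to the holder; the holder answers the client directly, and the client's continuation records the page contents together with the name of the node that replied.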