Cheating the I/O bottleneck: network storage with Trapeze/Myrinet

  • Authors:
  • Darrell C. Anderson;Jeffrey S. Chase;Syam Gadde;Andrew J. Gallatin;Kenneth G. Yocum;Michael J. Feeley

  • Affiliations:
  • Department of Computer Science, Duke University;Department of Computer Science, Duke University;Department of Computer Science, Duke University;Department of Computer Science, Duke University;Department of Computer Science, Duke University;Department of Computer Science, University of British Columbia

  • Venue:
  • ATEC '98 Proceedings of the annual conference on USENIX Annual Technical Conference
  • Year:
  • 1998

Abstract

Recent advances in I/O bus structures (e.g., PCI), high-speed networks, and fast, cheap disks have significantly expanded the I/O capacity of desktop-class systems. This paper describes a messaging system designed to deliver the potential of these advances for network storage systems including cluster file systems and network memory. We describe gms_net, an RPC-like kernel-kernel messaging system based on Trapeze, a new firmware program for Myrinet network interfaces. We show how the communication features of Trapeze and gms_net are used by the Global Memory Service (GMS), a kernel-based network memory system. The paper focuses on support for zero-copy page migration in GMS/Trapeze using two RPC variants important for peer-peer distributed services: (1) delegated RPC in which a request is delegated to a third party, and (2) nonblocking RPC in which replies are processed from the Trapeze receive interrupt handler. We present measurements of sequential file access from network memory in the GMS/Trapeze prototype on a Myrinet/Alpha cluster, showing the bandwidth effects of file system interfaces and communication choices. GMS/Trapeze delivers a peak read bandwidth of 96 MB/s using memory-mapped file I/O.
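The two RPC variants named in the abstract can be illustrated with a toy, in-process simulation. This is only a sketch under stated assumptions, not the gms_net or Trapeze interface: the `Network` and `Node` classes, the `getpage` and `reply` message names, and the per-node `directory` table are all hypothetical and chosen for illustration. Node A asks node B for a page, B delegates the request to node C, which holds the page, and C replies directly to A. A never blocks: it registers a continuation that runs when the reply arrives in its receive handler.

```python
from collections import deque
from itertools import count


class Network:
    """Toy in-process message queue standing in for the interconnect."""

    def __init__(self):
        self.nodes = {}
        self.queue = deque()
        self.log = []  # (src, dst, kind) for every delivered message

    def send(self, src, dst, kind, **body):
        self.queue.append((src, dst, kind, body))

    def run(self):
        # Deliver messages in FIFO order until the network is quiet.
        while self.queue:
            src, dst, kind, body = self.queue.popleft()
            self.log.append((src, dst, kind))
            self.nodes[dst].on_receive(src, kind, body)


class Node:
    """Hypothetical peer; none of these names come from the paper."""

    def __init__(self, name, net):
        self.name, self.net = name, net
        self.pages = {}      # page_id -> bytes cached on this node
        self.directory = {}  # page_id -> name of the node caching the page
        self.pending = {}    # request id -> continuation awaiting a reply
        self._ids = count(1)
        net.nodes[name] = self

    def get_page(self, page_id, directory_node, on_reply):
        """Nonblocking call: register a continuation and return at once."""
        req_id = next(self._ids)
        self.pending[req_id] = on_reply
        self.net.send(self.name, directory_node, "getpage",
                      page_id=page_id, req_id=req_id, reply_to=self.name)
        return req_id

    def on_receive(self, src, kind, body):
        """Stand-in for a receive interrupt handler; it never blocks."""
        if kind == "getpage":
            page_id = body["page_id"]
            if page_id in self.pages:
                # Reply straight to the original requester, not to `src`.
                self.net.send(self.name, body["reply_to"], "reply",
                              req_id=body["req_id"], data=self.pages[page_id])
            else:
                # Delegated RPC: forward the request unchanged to the holder.
                self.net.send(self.name, self.directory[page_id],
                              "getpage", **body)
        elif kind == "reply":
            # Nonblocking RPC: run the continuation inside the handler.
            self.pending.pop(body["req_id"])(body["data"])


net = Network()
a, b, c = Node("A", net), Node("B", net), Node("C", net)
b.directory[7] = "C"    # B knows that C caches page 7
c.pages[7] = b"page-7"

received = []
a.get_page(7, "B", received.append)  # returns immediately
net.run()
```

In this sketch the page crosses the network once, from C to A, instead of being relayed back through B, and no thread on A waits for the reply. Both properties are what make the variants attractive for peer-to-peer services such as network memory.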