Reducing network latency using subpages in a global memory environment

  • Authors:
  • Hervé A. Jamrozik;Michael J. Feeley;Geoffrey M. Voelker;James Evans, II;Anna R. Karlin;Henry M. Levy;Mary K. Vernon

  • Affiliations:
  • Department of Computer Science and Engineering, University of Washington;Department of Computer Science and Engineering, University of Washington;Department of Computer Science and Engineering, University of Washington;Department of Computer Science and Engineering, University of Washington;Department of Computer Science and Engineering, University of Washington;Department of Computer Science and Engineering, University of Washington;Department of Computer Science and Engineering, University of Washington

  • Venue:
  • Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
  • Year:
  • 1996

Abstract

New high-speed networks greatly encourage the use of network memory as a cache for virtual memory and file pages, thereby reducing the need for disk access. Because pages are the fundamental transfer and access units in remote memory systems, page size is a key performance factor. Recently, page sizes of modern processors have been increasing in order to provide more TLB coverage and amortize disk access costs. Unfortunately, for high-speed networks, small transfers are needed to provide low latency. This trend in page size is thus at odds with the use of network memory on high-speed networks.

This paper studies the use of subpages as a means of reducing transfer size and latency in a remote-memory environment. Using trace-driven simulation, we show how and why subpages reduce latency and improve performance of programs using network memory. Our results show that memory-intensive applications execute up to 1.8 times faster when executing with 1K-byte subpages, when compared to the same applications using full 8K-byte pages in the global memory system. Those same applications using 1K-byte subpages execute up to 4 times faster than they would using the disk for backing store. Using a prototype implementation on the DEC Alpha and AN2 network, we demonstrate how subpages can reduce remote-memory fault time; e.g., our prototype is able to satisfy a fault on a 1K subpage stored in remote memory in 0.5 milliseconds, one third the time of a full page.
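
The prototype figures in the abstract can be turned into a rough back-of-the-envelope model. The sketch below is illustrative only and is not taken from the paper: it assumes fault time grows linearly with transfer size (a fixed per-fault cost plus a per-kilobyte cost), and it reads "one third the time of a full page" as a full 8K-byte page fault taking about 1.5 milliseconds. The function names and the linear model itself are assumptions introduced here.

```python
# Rough linear model of remote-memory fault time, fitted to the two data
# points implied by the abstract. ASSUMPTION: fault time is
# fixed_cost + per_kb_cost * size; the paper may model this differently.

SUBPAGE_KB = 1            # 1K-byte subpage (from the abstract)
PAGE_KB = 8               # full 8K-byte page (from the abstract)
SUBPAGE_FAULT_MS = 0.5    # measured subpage fault time (from the abstract)
# "one third the time of a full page" implies roughly 3 x 0.5 ms.
FULL_PAGE_FAULT_MS = SUBPAGE_FAULT_MS * 3


def fit_linear_latency(size_a_kb, time_a_ms, size_b_kb, time_b_ms):
    """Return (fixed_ms, per_kb_ms) for a line through two (size, time) points."""
    per_kb_ms = (time_b_ms - time_a_ms) / (size_b_kb - size_a_kb)
    fixed_ms = time_a_ms - per_kb_ms * size_a_kb
    return fixed_ms, per_kb_ms


FIXED_MS, PER_KB_MS = fit_linear_latency(
    SUBPAGE_KB, SUBPAGE_FAULT_MS, PAGE_KB, FULL_PAGE_FAULT_MS
)


def predicted_fault_ms(size_kb):
    """Hypothetical fault time for a transfer of size_kb under the linear model."""
    return FIXED_MS + PER_KB_MS * size_kb


if __name__ == "__main__":
    print(f"fixed cost per fault: {FIXED_MS:.3f} ms")
    print(f"cost per KB:          {PER_KB_MS:.3f} ms")
    for size_kb in (1, 2, 4, 8):
        print(f"{size_kb}K transfer: {predicted_fault_ms(size_kb):.3f} ms")
```

Under these assumptions, a sizable share of the 0.5 ms subpage fault is fixed per-fault cost rather than transfer time, which is why fault time shrinks by a factor of three rather than eight when the transfer size drops from 8K to 1K bytes.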