New high-speed networks greatly encourage the use of network memory as a cache for virtual memory and file pages, thereby reducing the need for disk access. Because pages are the fundamental transfer and access units in remote memory systems, page size is a key performance factor. Recently, page sizes of modern processors have been increasing in order to provide more TLB coverage and amortize disk access costs. Unfortunately, for high-speed networks, small transfers are needed to provide low latency. This trend in page size is thus at odds with the use of network memory on high-speed networks.

This paper studies the use of subpages as a means of reducing transfer size and latency in a remote-memory environment. Using trace-driven simulation, we show how and why subpages reduce latency and improve the performance of programs using network memory. Our results show that memory-intensive applications run up to 1.8 times faster with 1K-byte subpages than the same applications using full 8K-byte pages in the global memory system. Those same applications using 1K-byte subpages run up to 4 times faster than they would using the disk for backing store. Using a prototype implementation on the DEC Alpha and AN2 network, we demonstrate how subpages can reduce remote-memory fault time; e.g., our prototype is able to satisfy a fault on a 1K-byte subpage stored in remote memory in 0.5 milliseconds, one third the time needed for a full page.
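To make the prototype numbers concrete, the following is a minimal back-of-the-envelope sketch, not the paper's model. It takes the two fault times implied by the abstract (0.5 ms for a 1K-byte subpage, and three times that for a full 8K-byte page) and fits a simple linear cost, a fixed per-fault overhead plus a per-kilobyte transfer cost. The linear form, the function names, and the fitted constants are all assumptions made for illustration; the real system may overlap or pipeline transfers in ways a straight line does not capture.

```python
# Hypothetical linear fault-time model: time = fixed + per_kb * size.
# Only the two measured points come from the abstract; the model itself
# is an assumption used for illustration.

SUBPAGE_KB = 1
PAGE_KB = 8
SUBPAGE_FAULT_MS = 0.5                 # stated in the abstract
PAGE_FAULT_MS = 3 * SUBPAGE_FAULT_MS   # "one third the time of a full page"


def fit_linear_fault_model(size_a_kb, time_a_ms, size_b_kb, time_b_ms):
    """Fit fixed overhead and per-KB cost through two (size, time) points."""
    per_kb = (time_b_ms - time_a_ms) / (size_b_kb - size_a_kb)
    fixed = time_a_ms - per_kb * size_a_kb
    return fixed, per_kb


fixed_ms, per_kb_ms = fit_linear_fault_model(
    SUBPAGE_KB, SUBPAGE_FAULT_MS, PAGE_KB, PAGE_FAULT_MS
)


def fault_time_ms(size_kb):
    """Estimated remote-memory fault time for a transfer of size_kb."""
    return fixed_ms + per_kb_ms * size_kb


if __name__ == "__main__":
    print(f"fixed overhead: {fixed_ms:.3f} ms, per-KB cost: {per_kb_ms:.3f} ms")
    for size_kb in (1, 2, 4, 8):
        print(f"{size_kb}K transfer: {fault_time_ms(size_kb):.3f} ms")
```

Under this assumed model, roughly 0.36 ms of each fault is fixed overhead and about 0.14 ms is added per kilobyte transferred, which is consistent with the abstract's point that smaller transfers lower fault latency on a fast network.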