NUMA policies and their relation to memory architecture
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
The robustness of NUMA memory management
SOSP '91 Proceedings of the thirteenth ACM symposium on Operating systems principles
The LRU-K page replacement algorithm for database disk buffering
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Scheduling and page migration for multiprocessor compute servers
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Informed prefetching and caching
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Application and architectural bottlenecks in large scale distributed shared memory machines
ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
ACM Transactions on Computer Systems (TOCS)
Operating system support for improving data locality on CC-NUMA compute servers
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Adaptive page replacement based on memory reference behavior
SIGMETRICS '97 Proceedings of the 1997 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
An optimality proof of the LRU-K page replacement algorithm
Journal of the ACM (JACM)
EELRU: simple and effective adaptive page replacement
SIGMETRICS '99 Proceedings of the 1999 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
IEEE Transactions on Parallel and Distributed Systems
A case for user-level dynamic page migration
Proceedings of the 14th international conference on Supercomputing
Proceedings of the 2000 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Compiler-based I/O prefetching for out-of-core applications
ACM Transactions on Computer Systems (TOCS)
SIGMETRICS '02 Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Dynamic page placement to improve locality in CC-NUMA multiprocessors for TPC-C
Proceedings of the 2001 ACM/IEEE conference on Supercomputing
IEEE Transactions on Computers
IPPS '95 Proceedings of the 9th International Symposium on Parallel Processing
2Q: A Low Overhead High Performance Buffer Management Replacement Algorithm
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
The Multi-Queue Replacement Algorithm for Second Level Buffer Caches
Proceedings of the General Track: 2002 USENIX Annual Technical Conference
LCR '00 Selected Papers from the 5th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
WSCLOCK—a simple and effective algorithm for virtual memory management
SOSP '81 Proceedings of the eighth ACM symposium on Operating systems principles
User-Level Dynamic Page Migration for Multiprogrammed Shared-Memory Multiprocessors
ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
ARC: A Self-Tuning, Low Overhead Replacement Cache
FAST '03 Proceedings of the 2nd USENIX Conference on File and Storage Technologies
CLOCK-Pro: an effective improvement of the CLOCK replacement
ATEC '05 Proceedings of the annual conference on USENIX Annual Technical Conference
OSDI'00 Proceedings of the 4th conference on Symposium on Operating System Design & Implementation - Volume 4
Program-counter-based pattern classification in buffer caching
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Hi-index | 0.00 |
CC-NUMA is a widely adopted and deployed architecture of high performance computers. These machines are attractive for their transparent access to local and remote memory. However, the prohibitive latency gap between local and remote access deteriorates applications' performance seriously due to memory access stalls. File system cache, especially, being shared by all processes, inevitably triggers many remote accesses. To address this problem, we suggest and implement a mechanism that uses local memory to cache remote file cache, of which the main purpose is to improve data locality. Using realistic workload on a two-node cc-NUMA machine, we show that the cost of such a mechanism is as low as 0.5%, the performance can be increased 14.3% at most, and the local hit ratio can be improved as much as 40%.