ACM Computing Surveys (CSUR) - The MIT Press scientific computation series
A New Memory Monitoring Scheme for Memory-Aware Scheduling and Partitioning
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Cooperative Caching for Chip Multiprocessors
Proceedings of the 33rd annual international symposium on Computer Architecture
Implementation of a reliable remote memory pager
ATEC '96 Proceedings of the 1996 annual conference on USENIX Annual Technical Conference
Collaborative Memory Pool in Cluster System
ICPP '07 Proceedings of the 2007 International Conference on Parallel Processing
Scalable high performance main memory system using phase-change memory technology
Proceedings of the 36th annual international symposium on Computer architecture
Disaggregated memory for expansion and sharing in blade servers
Proceedings of the 36th annual international symposium on Computer architecture
The multikernel: a new OS architecture for scalable multicore systems
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
The case for RAMClouds: scalable high-performance storage entirely in DRAM
ACM SIGOPS Operating Systems Review
Morphable memory system: a robust architecture for exploiting multi-level phase change memories
Proceedings of the 37th annual international symposium on Computer architecture
FlexSC: flexible system call scheduling with exception-less system calls
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Page placement in hybrid memory systems
Proceedings of the international conference on Supercomputing
Evaluating placement policies for managing capacity sharing in CMP architectures with private caches
ACM Transactions on Architecture and Code Optimization (TACO)
A collaborative memory system for high-performance and cost-effective clustered architectures
Proceedings of the 1st Workshop on Architectures and Systems for Big Data
Hi-index | 0.00 |
With the fast development of highly-integrated distributed systems (cluster systems), designers face interesting memory hierarchy design choices while attempting to avoid the notorious disk swapping. Swapping to the free remote memory through Memory Collaboration has demonstrated its cost-effectiveness compared to over provisioning the cluster for peak load requirements. Recent memory collaboration studies propose several ways on accessing the under-utilized remote memory in static system configurations, without detailed exploration of the dynamic memory collaboration. Dynamic collaboration is an important aspect given the run-time memory usage fluctuations in clustered systems. Further, as the interest in memory collaboration grows, it is crucial to understand the existing performance bottlenecks, overheads, and potential optimization. In this paper we address these two issues. First, we propose an Autonomous Collaborative Memory System (ACMS) that manages memory resources dynamically at run time to optimize performance. We implement a prototype realizing the proposed ACMS, experiment with a wide range of real-world applications, and show up to 3x performance speedup compared to a non-collaborative memory system without perceivable performance impact on nodes that provide memory. Second, we analyze, in depth, the end-to-end memory collaboration overhead and pinpoint the corresponding bottlenecks.