Dodo: A User-Level System for Exploiting Idle Memory in Workstation Clusters

Authors:
Samir Koussih;Anurag Acharya;Sanjeev Setia
Venue:
HPDC '99 Proceedings of the 8th IEEE International Symposium on High Performance Distributed Computing
Year:
1999

Abstract

In this paper, we present the design and implementation of Dodo, an efficient user-level system for harvesting idle memory in off-the-shelf clusters of workstations. Dodo enables data-intensive applications to use remote memory in a cluster as an intermediate cache between local memory and disk. It requires no modifications to the operating system or processor firmware and is hence portable to multiple platforms. Further, the memory recruitment policy used by Dodo is designed to minimize any delays experienced by the owners of the desktop machines whose memory is harvested. Our implementation of Dodo is operational and currently runs on Linux 2.0.35. For communication, Dodo can use either UDP/IP or U-Net, the low-latency user-level network architecture developed by von Eicken et al. [BasuBVE95]. We evaluated the performance improvements that can be achieved by using Dodo for two real applications and three synthetic benchmarks. Our results show that the speedup obtained for an application is highly dependent on its I/O access pattern and data set size. Significant speedups (a factor of 2 to 3) were obtained for applications whose working sets are larger than the local memory of a workstation but smaller than the aggregate memory available on the cluster, and for applications that can benefit from the zero-seek nature of remote memory.
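The idea of remote memory as an intermediate cache between local memory and disk can be illustrated with a toy model. The sketch below is not Dodo's interface or implementation: the class name `TieredCache`, its parameters, and the LRU replacement at both levels are assumptions made for illustration, since the abstract does not describe the replacement policy or the API. The remote tier is simulated with an in-process dictionary rather than network communication.

```python
from collections import OrderedDict


class TieredCache:
    """Toy three-level read path: local memory, then remote memory, then disk.

    Hypothetical illustration only; LRU at both levels is an assumption.
    """

    def __init__(self, local_capacity, remote_capacity, disk):
        self.local = OrderedDict()    # LRU of blocks held in local memory
        self.remote = OrderedDict()   # stand-in for idle memory on other nodes
        self.local_capacity = local_capacity
        self.remote_capacity = remote_capacity
        self.disk = disk              # dict of block id -> bytes, stand-in for disk
        self.hits = {"local": 0, "remote": 0, "disk": 0}

    def read(self, block_id):
        if block_id in self.local:
            self.hits["local"] += 1
            self.local.move_to_end(block_id)      # mark as most recently used
            return self.local[block_id]
        if block_id in self.remote:
            self.hits["remote"] += 1              # served without touching disk
            data = self.remote.pop(block_id)
        else:
            self.hits["disk"] += 1                # slowest path
            data = self.disk[block_id]
        self._install_local(block_id, data)
        return data

    def _install_local(self, block_id, data):
        self.local[block_id] = data
        if len(self.local) > self.local_capacity:
            # Evict the least recently used local block to remote memory.
            victim_id, victim = self.local.popitem(last=False)
            self.remote[victim_id] = victim
            if len(self.remote) > self.remote_capacity:
                self.remote.popitem(last=False)   # drop it; disk still has a copy
```

In this model, repeatedly scanning four blocks with room for two locally and two remotely sends only the first pass to disk, while the same scan with no remote capacity goes to disk on every read. That mirrors the case the abstract highlights: a working set larger than local memory but smaller than the aggregate memory of the cluster.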