Incorporating Job Migration and Network RAM to Share Cluster Memory Resources

Authors:
Li Xiao;Xiaodong Zhang;Stefan A. Kubricht
Venue:
HPDC '00 Proceedings of the 9th IEEE International Symposium on High Performance Distributed Computing
Year:
2000

Citing 0
Cited 5

Dynamic Load Sharing With Unknown Memory Demands in Clusters
ICDCS '01 Proceedings of the 21st International Conference on Distributed Computing Systems

On scalable and locality-aware web document sharing
Journal of Parallel and Distributed Computing - Scalable web services and architecture

Adaptive Memory Allocations in Clusters to Handle Unexpectedly Large Data-Intensive Jobs
IEEE Transactions on Parallel and Distributed Systems

Effectively Utilizing Global Cluster Memory for Large Data-Intensive Parallel Programs
IEEE Transactions on Parallel and Distributed Systems

Energy optimization schemes in cluster with virtual machines
Cluster Computing


Abstract

Job migrations and network RAM are two major approaches for effectively using global memory resources in a workstation cluster, aimed at reducing page faults in each local workstation and improving the overall performance of cluster computing. Using remote executions or preemptive migrations, a load sharing system is able to migrate a job from a workstation without sufficient memory space to a lightly loaded workstation with large idle memory space for the migrated job. In a network RAM system, if a job cannot find sufficient memory space for its working sets, it will utilize idle memory space from other workstations in the cluster through remote paging. Conducting trace-driven simulations, we have compared the performance and trade-offs of the two approaches and their impacts on job execution time and cluster scalability. Our study indicates that job-migration-based load sharing schemes are able to balance executions of jobs in a cluster well, while network RAM, by sharing all the idle memory resources in a cluster, is able to satisfy data-intensive jobs that may not be migratable. We also show that a network RAM cluster of workstations is scalable only if the network is sufficiently fast. Finally, we propose an improved load-sharing scheme by combining job migrations with network RAM for cluster computing. This scheme uses remote execution to initially allocate a job to the most lightly loaded workstation and, if necessary, network RAM to provide a larger memory space for the job than would be available otherwise. The improved scheme has the merits of both job migrations and network RAM. Our experiments show its effectiveness and scalability for cluster computing.
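
The combined scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how load is measured or how memory donors are chosen, so the sketch assumes that "most lightly loaded" means fewest running jobs (ties broken by most idle memory) and that network RAM is borrowed from the workstations with the most idle memory first. The names Workstation, Placement, and place_job are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Workstation:
    name: str
    memory_mb: int      # total memory available to jobs (illustrative unit)
    used_mb: int = 0    # memory held by jobs running locally
    lent_mb: int = 0    # memory lent to remote jobs as network RAM
    jobs: int = 0       # number of jobs running locally

    @property
    def idle_mb(self) -> int:
        return self.memory_mb - self.used_mb - self.lent_mb


@dataclass
class Placement:
    host: str                                    # workstation that runs the job
    local_mb: int                                # memory served by the host
    remote_mb: Dict[str, int] = field(default_factory=dict)  # donor -> MB


def place_job(cluster: List[Workstation], demand_mb: int) -> Optional[Placement]:
    """Place a job by remote execution, then cover any shortfall with network RAM.

    Returns None, leaving the cluster unchanged, when the cluster-wide idle
    memory cannot satisfy the demand.
    """
    # Step 1 (remote execution): choose the most lightly loaded workstation.
    # Assumption: load = job count, ties broken by larger idle memory.
    host = min(cluster, key=lambda w: (w.jobs, -w.idle_mb))
    local = min(demand_mb, host.idle_mb)
    shortfall = demand_mb - local

    # Step 2 (network RAM): borrow idle memory from the other workstations.
    # Assumption: donors with the most idle memory are used first.
    donors = sorted((w for w in cluster if w is not host),
                    key=lambda w: w.idle_mb, reverse=True)
    plan: Dict[str, int] = {}
    for donor in donors:
        if shortfall == 0:
            break
        take = min(shortfall, donor.idle_mb)
        if take > 0:
            plan[donor.name] = take
            shortfall -= take

    if shortfall > 0:
        return None  # not enough idle memory anywhere in the cluster

    # Commit the plan only once it is known to be feasible.
    host.used_mb += local
    host.jobs += 1
    for donor in donors:
        donor.lent_mb += plan.get(donor.name, 0)
    return Placement(host.name, local, plan)
```

For example, with three 256 MB workstations where C is idle apart from one small job, a 300 MB job would run on C and page its remaining 44 MB to the workstation with the most idle memory. A real system would also have to weigh network speed, since the abstract notes that network RAM scales only if the network is sufficiently fast.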