Tunable randomization for load management in shared-disk clusters

Authors:
Changxun Wu;Randal Burns
Affiliations:
Johns Hopkins University, Baltimore, MD;Johns Hopkins University, Baltimore, MD
Venue:
ACM Transactions on Storage (TOS)
Year:
2005

Abstract

We develop and evaluate a system for load management in shared-disk file systems built on clusters of heterogeneous computers. It balances workload by moving file sets among cluster server nodes. It responds to changes in server resources that arise from failure and recovery and from the dynamic addition or removal of servers. It also realizes performance consistency, that is, nearly uniform performance across all servers. The system is adaptive and self-tuning. It operates without any a priori knowledge of workload properties or server capabilities. Rather, it continuously tunes load placement using a technique called adaptive, nonuniform (ANU) randomization. ANU randomization realizes the scalability and metadata reduction benefits of hash-based, randomized placement techniques, while avoiding hashing's drawbacks: load skew, inability to cope with heterogeneity, and lack of tunability. ANU randomization outperforms virtual-processor approaches to load balancing, while reducing the amount of shared state among servers and the amount of load movement.
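
The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea of tunable, nonuniform hash-based placement, not the authors' ANU algorithm. It assumes each server owns a contiguous slice of the unit interval sized by a tunable weight, that file sets are hashed to a point in that interval, and that weights are nudged using observed per-server latency. The names `hash_to_unit`, `place`, `retune`, and the `step` parameter are hypothetical and chosen for illustration.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate
from typing import Dict


def hash_to_unit(name: str) -> float:
    """Hash a file-set name to a point in [0, 1)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def place(file_set: str, weights: Dict[str, float]) -> str:
    """Pick the server whose slice of the unit interval contains the hash.

    Slice sizes are proportional to the (assumed) per-server weights, so
    placement is randomized but not uniform.
    """
    servers = sorted(weights)
    total = sum(weights[s] for s in servers)
    # Upper bound of each server's slice, in sorted-name order.
    bounds = list(accumulate(weights[s] / total for s in servers))
    idx = bisect_right(bounds, hash_to_unit(file_set))
    # Guard against floating-point rounding in the last bound.
    return servers[min(idx, len(servers) - 1)]


def retune(weights: Dict[str, float],
           latency: Dict[str, float],
           step: float = 0.1) -> Dict[str, float]:
    """Shrink the weight of slower-than-average servers, grow faster ones.

    This feedback rule is an assumption made for illustration only.
    """
    mean = sum(latency.values()) / len(latency)
    tuned = {}
    for server, weight in weights.items():
        if latency[server] > mean:
            tuned[server] = weight * (1.0 - step)
        elif latency[server] < mean:
            tuned[server] = weight * (1.0 + step)
        else:
            tuned[server] = weight
    return tuned


if __name__ == "__main__":
    weights = {"server-a": 1.0, "server-b": 1.0}
    latency_ms = {"server-a": 10.0, "server-b": 30.0}  # made-up measurements
    weights = retune(weights, latency_ms)
    print(weights, place("fileset-42", weights))
```

Only the weights need to be shared among servers in this sketch; any server can recompute a file set's location from its name. Note that resizing contiguous slices shifts their boundaries, so this simplification may move more file sets than a scheme designed to minimize load movement.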