Dynamic load balancing for I/O-intensive applications on clusters

Authors:
Xiao Qin;Hong Jiang;Adam Manzanares;Xiaojun Ruan;Shu Yin
Affiliations:
Auburn University, AL;University of Nebraska, Lincoln;Auburn University, AL;Auburn University, AL;Auburn University, AL
Venue:
ACM Transactions on Storage (TOS)
Year:
2009

Abstract

Load balancing for clusters has been investigated extensively, mainly focusing on the effective usage of global CPU and memory resources. However, previous CPU- or memory-centric load-balancing schemes suffer a significant performance drop under I/O-intensive workloads due to the imbalance of I/O load. To solve this problem, we propose two simple yet effective I/O-aware load-balancing schemes for two types of clusters: (1) homogeneous clusters, where nodes are identical, and (2) heterogeneous clusters, which comprise a variety of nodes with different performance characteristics in computing power, memory capacity, and disk speed. In addition to assigning I/O-intensive sequential and parallel jobs to nodes with light I/O loads, the proposed schemes judiciously take into account both CPU and memory load sharing in the system. Therefore, our schemes are able to maintain high performance for a wide spectrum of workloads. We develop analytic models to study mean slowdowns, task arrival, and transfer processes at the system level. Using a set of real I/O-intensive parallel applications and synthetic parallel jobs with various I/O characteristics, we show that, across a diverse set of workload conditions, our proposed schemes consistently improve performance over existing non-I/O-aware load-balancing schemes, including CPU- and memory-aware schemes and a PBS-like batch scheduler for parallel and sequential jobs. Importantly, this performance improvement becomes much more pronounced when the applications are I/O-intensive. For example, the proposed approaches deliver performance improvements of 23.6% to 88.0% for I/O-intensive applications such as LU decomposition, Sparse Cholesky, Titan, parallel text searching, and data mining. When I/O load is low or well balanced, the proposed schemes are capable of maintaining the same level of performance as the existing non-I/O-aware schemes.
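
The placement idea described in the abstract, sending I/O-intensive jobs to nodes with light I/O load while still sharing CPU load, can be illustrated with a minimal sketch. This is not the paper's algorithm: the `Node` and `Job` types, the load units, the rule that a job counts as I/O-intensive when its I/O demand exceeds its CPU demand, and the speed-weighted cost are all assumptions made for illustration, and memory load is omitted for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical cluster node; loads are in arbitrary work units."""
    name: str
    cpu_load: float
    io_load: float
    cpu_speed: float = 1.0   # relative speed; 1.0 everywhere models a homogeneous cluster
    disk_speed: float = 1.0  # relative disk speed (heterogeneous clusters differ here)


@dataclass
class Job:
    """Hypothetical job described by its CPU and I/O demand."""
    cpu_demand: float
    io_demand: float


def pick_node(job: Job, nodes: List[Node]) -> Node:
    """Choose a node by the resource the job stresses most (assumed rule)."""
    if job.io_demand > job.cpu_demand:
        # I/O-intensive job: minimise projected I/O load, scaled by disk speed.
        def cost(n: Node) -> float:
            return (n.io_load + job.io_demand) / n.disk_speed
    else:
        # Otherwise fall back to CPU-based load sharing.
        def cost(n: Node) -> float:
            return (n.cpu_load + job.cpu_demand) / n.cpu_speed
    return min(nodes, key=cost)


def assign(job: Job, nodes: List[Node]) -> Node:
    """Place the job and update the chosen node's recorded loads."""
    target = pick_node(job, nodes)
    target.cpu_load += job.cpu_demand
    target.io_load += job.io_demand
    return target


if __name__ == "__main__":
    cluster = [Node("a", cpu_load=1.0, io_load=10.0),
               Node("b", cpu_load=5.0, io_load=2.0)]
    print(assign(Job(cpu_demand=1.0, io_demand=8.0), cluster).name)
    print(assign(Job(cpu_demand=4.0, io_demand=1.0), cluster).name)
```

In this sketch a CPU-only balancer would send the first job to node `a` because its CPU load is lowest, even though `a` already carries most of the I/O load; choosing by projected I/O load avoids that imbalance. Dividing by `disk_speed` and `cpu_speed` is one simple way to account for heterogeneous nodes, under the assumptions stated above.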