In this paper, we consider how to optimize I/O server placement in order to improve parallel I/O performance in switch-based clusters. Significant advances in cluster networks in recent years have made it practical to connect tens of thousands of hosts via networks that have enormous and scalable total capacity, and in which communication between a host and any other host incurs the same cost. This same-cost property frees users from considering network contention and allows them to concentrate on load-balancing issues. We formulate the server placement problem on a cluster that has the same-cost property as a weighted bipartite matching problem, with the goal of balancing the workload on the I/O nodes. To find an optimal solution to this problem, we propose an $O(n^{3/2} m (\log n + \log m))$ algorithm, called Load Balance Matching (LBM), where n is the number of compute nodes and m is the number of I/O servers. We also investigate server placement for general clusters, in which multiple same-cost subclusters are interconnected to form a large cluster. This class of clusters typically adopts irregular topologies that allow the construction of scalable systems with an incremental expansion capability. Because the bandwidth of the network links between subclusters is limited, link contention is a major concern when distributing servers over the entire network. We show that finding an optimal placement strategy for general clusters with the goal of minimizing link contention is computationally intractable. To address this, we propose a hierarchical strategy that places servers in two steps. In the first step, to minimize link contention, we decide which subcluster each server should be assigned to; we propose a tree-based heuristic algorithm, called Load Balance Traversing (LBT), to solve this problem. In the second step, the LBM algorithm decides the location of each server within a subcluster.
Our simulation results demonstrate that LBT achieves a significant improvement in parallel I/O performance over four other algorithms, and is near-optimal in some cases.
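The abstract does not give the details of LBM, so the following is only an illustrative sketch of the general idea of load-balanced placement as a bipartite matching problem, not the authors' algorithm. It assumes a hypothetical cost matrix in which `cost[s][h]` is the load that results from placing server `s` on host `h`, and it minimizes the largest such load by binary searching over the candidate thresholds and testing each one with a simple augmenting-path matching. The function names and the cost model are assumptions made for illustration. Because it uses plain augmenting paths rather than a faster matching routine, its worst-case running time is higher than the bound quoted for LBM.

```python
from typing import List, Optional, Tuple


def can_place(cost: List[List[float]], limit: float) -> Optional[List[int]]:
    """Try to give every server its own host using only pairs with cost <= limit.

    Returns a list mapping server index -> host index, or None if impossible.
    """
    n_servers = len(cost)
    n_hosts = len(cost[0]) if cost else 0
    host_owner = [-1] * n_hosts  # host_owner[h] = server currently placed on h

    def try_assign(s: int, seen: List[bool]) -> bool:
        # Augmenting-path step: find a host for s, relocating others if needed.
        for h in range(n_hosts):
            if cost[s][h] <= limit and not seen[h]:
                seen[h] = True
                if host_owner[h] == -1 or try_assign(host_owner[h], seen):
                    host_owner[h] = s
                    return True
        return False

    for s in range(n_servers):
        if not try_assign(s, [False] * n_hosts):
            return None

    placement = [-1] * n_servers
    for h, s in enumerate(host_owner):
        if s != -1:
            placement[s] = h
    return placement


def balanced_placement(cost: List[List[float]]) -> Optional[Tuple[float, List[int]]]:
    """Minimize the maximum per-server load over all one-to-one placements.

    Returns (smallest feasible maximum load, placement), or None if the
    servers cannot all be placed on distinct hosts.
    """
    thresholds = sorted({c for row in cost for c in row})
    lo, hi = 0, len(thresholds) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        placement = can_place(cost, thresholds[mid])
        if placement is not None:
            best = (thresholds[mid], placement)
            hi = mid - 1  # feasible: try a tighter load limit
        else:
            lo = mid + 1  # infeasible: relax the load limit
    return best


# Two servers, three candidate hosts (hypothetical loads).
example_cost = [[4, 2, 8],
                [3, 7, 5]]
result = balanced_placement(example_cost)
print(result)  # (3, [1, 0]): server 0 on host 1, server 1 on host 0
```

Searching over the sorted distinct costs works because feasibility is monotone: if a placement exists under some load limit, it also exists under any larger limit.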