Optimizing server placement for parallel I/O in switch-based clusters
Journal of Parallel and Distributed Computing
Switch-based clusters, that is, networks of workstations or PCs connected by commodity switches, have been an appealing vehicle for high-performance computing. Despite their attractive features, cluster systems still have some limits when compared with traditional massively parallel machines. First, cluster systems usually have a limited number of processing nodes, which makes full utilization of the computing power of each node a critical issue. Second, cluster systems are usually built with slower interconnects, which makes the network speed, not the disk speed, the limiting factor for parallel I/O performance.

The notion of part-time I/O is commonly used for I/O in clusters: a subset of processing nodes become I/O nodes at I/O time and return to computation after finishing the I/O operation. Careful assignment of part-time I/O nodes is the key to overcoming both of these limits. Prior work reported an optimal assignment strategy for cluster systems with shared-media interconnects, based on an optimization that minimizes the total amount of remote data transfer in parallel I/O. In this paper, we show that load balance on the I/O nodes, not the total amount of remote data transfer, is the key optimization criterion for assigning part-time I/O nodes in switch-based clusters. We formulate the assignment problem as a weighted bipartite matching whose goal is to balance the workload on the I/O nodes. We then propose an O(n^(3/2) m (log n + log m)) algorithm that finds an optimal solution to this problem, where n is the number of compute nodes and m is the number of I/O nodes. Experimental results on a 16-node PC cluster and simulation results for larger clusters are reported.
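To make the difference between minimizing total transfer and balancing load concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It swaps in a textbook bottleneck assignment: a binary search over a load threshold combined with augmenting-path bipartite matching. The matrix `load`, where `load[j][i]` is the remote data that I/O role j would handle if placed on compute node i, and the function names are assumptions made only for illustration.

```python
from typing import List, Optional, Tuple


def match_within(load: List[List[int]], limit: int) -> Optional[List[int]]:
    """Place every I/O role on a distinct node using only pairs with load <= limit.

    Returns assignment[j] = node chosen for role j, or None if no such placement exists.
    """
    m, n = len(load), len(load[0])
    role_at = [-1] * n  # role_at[i] = role currently placed on node i, or -1

    def place(j: int, seen: List[bool]) -> bool:
        # Augmenting-path step: try each admissible node, evicting and
        # re-placing the current occupant when necessary.
        for i in range(n):
            if load[j][i] <= limit and not seen[i]:
                seen[i] = True
                if role_at[i] == -1 or place(role_at[i], seen):
                    role_at[i] = j
                    return True
        return False

    for j in range(m):
        if not place(j, [False] * n):
            return None

    assignment = [-1] * m
    for i, j in enumerate(role_at):
        if j != -1:
            assignment[j] = i
    return assignment


def balanced_assignment(load: List[List[int]]) -> Tuple[int, List[int]]:
    """Minimize the heaviest per-role load (assumes m <= n, so a placement exists)."""
    limits = sorted({w for row in load for w in row})
    lo, hi = 0, len(limits) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        assignment = match_within(load, limits[mid])
        if assignment is not None:
            best = (limits[mid], assignment)  # feasible: try a smaller threshold
            hi = mid - 1
        else:
            lo = mid + 1  # infeasible: the threshold must grow
    return best


# Hypothetical loads for 2 I/O roles on 2 nodes.
# Placement (role 0 on node 0, role 1 on node 1) has total 7 and maximum 7.
# Placement (role 0 on node 1, role 1 on node 0) has total 8 and maximum 4.
example = [[0, 4],
           [4, 7]]
bottleneck, placement = balanced_assignment(example)
```

In this hypothetical instance, a criterion that minimizes the total would prefer the first placement, while the balance criterion prefers the second, whose heaviest I/O role carries less. The sketch is not tuned for speed and does not reproduce the complexity bound stated in the abstract.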