A cost-effective secondary-storage architecture for parallel computers is to distribute storage across all processors, each of which engages in either computation or I/O, depending on the demands of the moment. A difficulty with this architecture is that access to storage on another processor typically requires that processor's cooperation, which can be hard to arrange if it is busy with other computation. One partial solution to this problem is to require that remote I/O operations occur only via collective calls. In this paper, we describe an alternative approach based on the use of single-sided communication operations such as Active Messages. We present an implementation of this approach, called Distant I/O (DIO), and experimental results that quantify the low-level performance of DIO mechanisms. The technique is exploited to support a noncollective parallel shared-file model for a large out-of-core scientific application with very high I/O bandwidth requirements. The achieved performance exceeds by a wide margin that of a well-equipped PIOFS parallel filesystem on the IBM SP.
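The key idea above, that a handler services remote reads on message arrival so the storage-owning processor's computation never has to cooperate, can be illustrated with a minimal sketch. This is a hypothetical, thread-based analogy (not the paper's Myrinet/SP implementation): `DistantIONode`, its handler loop, and the byte-range read protocol are all invented for illustration.

```python
import threading
import queue

class DistantIONode:
    """One process: owns a slice of disk-resident data and runs a
    handler that answers remote reads asynchronously, in the spirit
    of an Active Message handler."""
    def __init__(self, local_data):
        self.local_data = local_data   # storage this node owns
        self.inbox = queue.Queue()     # incoming read requests
        self.handler = threading.Thread(target=self._serve, daemon=True)
        self.handler.start()

    def _serve(self):
        # Runs on message arrival: the node's main computation is
        # never interrupted and never has to post a matching call.
        while True:
            offset, length, reply = self.inbox.get()
            if offset is None:         # shutdown sentinel
                break
            reply.put(self.local_data[offset:offset + length])

    def remote_read(self, target, offset, length):
        # Noncollective, single-sided read of another node's storage.
        reply = queue.Queue()
        target.inbox.put((offset, length, reply))
        return reply.get()

    def shutdown(self):
        self.inbox.put((None, None, None))

# Node A reads node B's data while B's compute thread does nothing.
b = DistantIONode(bytearray(range(16)))
a = DistantIONode(bytearray())
chunk = a.remote_read(b, 4, 4)   # bytes 4..7 of B's storage
```

The contrast with the collective-only solution is that `remote_read` can be issued by any one node at any time; no global phase in which every processor enters an I/O call is required.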