Server-directed collective I/O in Panda

Authors:
K. E. Seamons;Y. Chen;P. Jones;J. Jozwiak;M. Winslett
Affiliations:
Center for Advanced Database Research, Computer Science Department, University of Illinois, Urbana, Illinois;Center for Advanced Database Research, Computer Science Department, University of Illinois, Urbana, Illinois;Center for Advanced Database Research, Computer Science Department, University of Illinois, Urbana, Illinois;Center for Advanced Database Research, Computer Science Department, University of Illinois, Urbana, Illinois;Center for Advanced Database Research, Computer Science Department, University of Illinois, Urbana, Illinois
Venue:
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Year:
1995

References: 12
Cited by: 91

Design and Evaluation of primitives for Parallel I/O

Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Applications-driven parallel I/O

Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Extensible file system (ELFS): an object-oriented approach to high performance file I/O

OOPSLA '94 Proceedings of the ninth annual conference on Object-oriented programming systems, language, and applications
Dynamic file-access characteristics of a production parallel scientific workload

Proceedings of the 1994 ACM/IEEE conference on Supercomputing
An efficient abstract interface for multidimensional array I/O

Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Language, compiler and parallel database support for I/O intensive applications

HPCN Europe '95 Proceedings of the International Conference and Exhibition on High-Performance Computing and Networking
Physical Schemas for Large Multidimensional Arrays in Scientific Computing Applications

Proceedings of the Seventh International Working Conference on Scientific and Statistical Database Management
High-Level Fault Tolerance in Distributed Programs
Throughput of Existing Multiprocessor File Systems (An Informal Study)
Disk-directed I/O for an Out-of-Core Computation
Expanding the Potential for Disk-Directed I/O
Characterizing Parallel File-Access Patterns on a Large-Scale Multiprocessor

Flexibility and performance of parallel file systems

ACM SIGOPS Operating Systems Review
HFS: a performance-oriented flexible file system based on building-block compositions

Proceedings of the fourth workshop on I/O in parallel and distributed systems: part of the federated computing research conference
Scalable message passing in Panda

Proceedings of the fourth workshop on I/O in parallel and distributed systems: part of the federated computing research conference
The galley parallel file system

ICS '96 Proceedings of the 10th international conference on Supercomputing
Strategic directions in storage I/O issues in large-scale computing

ACM Computing Surveys (CSUR) - Special ACM 50th-anniversary issue: strategic directions in computing research
Disk-directed I/O for MIMD multiprocessors

ACM Transactions on Computer Systems (TOCS)
HFS: a performance-oriented flexible file system based on building-block compositions

ACM Transactions on Computer Systems (TOCS)
Implementation of collective I/O in the Intel Paragon parallel file system: initial experiences

ICS '97 Proceedings of the 11th international conference on Supercomputing
Optimizing collective I/O performance on parallel computers: a multisystem study

ICS '97 Proceedings of the 11th international conference on Supercomputing
Exploiting local data in parallel array I/O on a practical network of workstations

Proceedings of the fifth workshop on I/O in parallel and distributed systems
Remote I/O: fast access to distant storage

Proceedings of the fifth workshop on I/O in parallel and distributed systems
Automatic parallel I/O performance optimization in Panda

Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
On implementing MPI-IO portably and with high performance

Proceedings of the sixth workshop on I/O in parallel and distributed systems
Efficient input and output for scientific simulations

Proceedings of the sixth workshop on I/O in parallel and distributed systems
The impact of spatial layout of jobs on parallel I/O performance

Proceedings of the sixth workshop on I/O in parallel and distributed systems
Querying very large multi-dimensional datasets in ADR

SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Informed prefetching of collective input/output requests

SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Sun MPI I/O: efficient I/O for parallel applications

SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
An evaluation of Java's I/O capabilities for high-performance computing

Proceedings of the ACM 2000 conference on Java Grande
Automated Tuning of Parallel I/O Systems: An Approach to Portable I/O Performance for Scientific Applications

IEEE Transactions on Software Engineering - Special issue on architecture-independent languages and software tools parallel processing
Performance modeling for the Panda array I/O library

Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
Integrating parallel file I/O and database support for high-performance scientific data management

Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Tuning high-performance scientific codes: the use of performance models to control resource usage during data migration and I/O

ICS '01 Proceedings of the 15th international conference on Supercomputing
Compiler-Directed Collective-I/O

IEEE Transactions on Parallel and Distributed Systems
A case for using MPI's derived datatypes to improve I/O performance

SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
PDS/PIO: lightweight libraries for collective parallel I/O

SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
Parallel simulation of parallel file systems and I/O programs

SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
Active buffering plus compressed migration: an integrated solution to parallel simulations' data transport needs

ICS '02 Proceedings of the 16th international conference on Supercomputing
An Experimental Evaluation of I/O Optimizations on Different Applications

IEEE Transactions on Parallel and Distributed Systems
Implementing noncollective parallel I/O in cluster environments using Active Message communication

Cluster Computing
Parallel data intensive computing in scientific and commercial applications

Parallel Computing - Parallel data-intensive algorithms and applications
Mapping Functions and Data Redistribution for Parallel Files

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Enhancing Data Migration Performance via Parallel Data Compression

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Faster Collective Output through Active Buffering

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
A Scientific Data Management System for Irregular Applications

IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Design and Evaluation of a Compiler-Directed Collective I/O Technique

Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
A Collective I/O Scheme Based on Compiler Analysis

LCR '00 Selected Papers from the 5th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
Towards Portable Runtime Support for Irregular and Out-of-Core Computations

Proceedings of the 6th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Placement of I/O servers to improve parallel I/O performance on switch-based clusters

ICS '03 Proceedings of the 17th annual international conference on Supercomputing
An Abstract-Device Interface for Implementing Portable Parallel-I/O Interfaces

FRONTIERS '96 Proceedings of the 6th Symposium on the Frontiers of Massively Parallel Computation
PMPIO - A Portable Implementation of MPI-IO

FRONTIERS '96 Proceedings of the 6th Symposium on the Frontiers of Massively Parallel Computation
Disk Resident Arrays: An Array-Oriented I/O Library for Out-Of-Core Computations

FRONTIERS '96 Proceedings of the 6th Symposium on the Frontiers of Massively Parallel Computation
High-performance scientific data management system

Journal of Parallel and Distributed Computing
References

Sourcebook of parallel computing
A distributed multi-storage I/O system for data intensive scientific computing

Parallel Computing - Special issue: Parallel and distributed scientific and engineering computing
Integrating collective I/O and cooperative caching into the "clusterfile" parallel file system

Proceedings of the 18th annual international conference on Supercomputing
Distributed Scheduling of Parallel I/O in the Presence of Data Replication

IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
Parallel I/O

International Journal of High Performance Computing Applications
Exploiting Inter-File Access Patterns Using Multi-Collective I/O

FAST '02 Proceedings of the 1st USENIX Conference on File and Storage Technologies
Design of a next generation sampling service for large scale data analysis applications

Proceedings of the 19th annual international conference on Supercomputing
A study of I/O methods for parallel visualization of large-scale data

Parallel Computing - Parallel graphics and visualization
The Globus Striped GridFTP Framework and Server

SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
The impact of spatial layout of jobs on I/O hotspots in mesh networks

Journal of Parallel and Distributed Computing - Special issue: Design and performance of networks for super-, cluster-, and grid-computing: Part I
Discretionary Caching for I/O on Clusters

Cluster Computing
High-Level Buffering for Hiding Periodic Output Cost in Scientific Simulations

IEEE Transactions on Parallel and Distributed Systems
Multicollective I/O: A technique for exploiting inter-file access patterns

ACM Transactions on Storage (TOS)
Scalable Design and Implementations for MPI Parallel Overlapping I/O

IEEE Transactions on Parallel and Distributed Systems
Coupling prefix caching and collective downloads for remote dataset access

Proceedings of the 20th annual international conference on Supercomputing
Improving I/O performance of applications through compiler-directed code restructuring

FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
Dynamically adapting file domain partitioning methods for collective I/O based on underlying parallel file system locking protocols

Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Comparative evaluation of overlap strategies with study of I/O overlap in MPI-IO

ACM SIGOPS Operating Systems Review
A collective I/O implementation based on inspector-executor paradigm

The Journal of Supercomputing
Optimizing server placement for parallel I/O in switch-based clusters

Journal of Parallel and Distributed Computing
Data Locality Aware Strategy for Two-Phase Collective I/O

High Performance Computing for Computational Science - VECPAR 2008
An implementation of parallel file distribution in an agent hierarchy

The Journal of Supercomputing
DataStager: scalable data staging services for petascale applications

Proceedings of the 18th ACM international symposium on High performance distributed computing
Implementation and Evaluation of File Write-Back and Prefetching for MPI-IO Over GPFS

International Journal of High Performance Computing Applications
A Scalable Message Passing Interface Implementation of an Ad-Hoc Parallel I/O system

International Journal of High Performance Computing Applications
InterferenceRemoval: removing interference of disk access for MPI programs through data replication

Proceedings of the 24th ACM International Conference on Supercomputing
DataStager: scalable data staging services for petascale applications

Cluster Computing
DataSpaces: an interaction and coordination framework for coupled simulation workflows

Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
A layout-aware optimization strategy for collective I/O

Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
IOrchestrator: Improving the Performance of Multi-node I/O Systems via Inter-Server Coordination

Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Functional Partitioning to Optimize End-to-End Performance on Many-core Architectures

Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
ViMPIOS, a "truly" portable MPI-IO implementation

EURO-PDP'00 Proceedings of the 8th Euromicro conference on Parallel and distributed processing
I/O conscious algorithm design and systems support for data analysis on emerging architectures

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
The impact of applications' I/O strategies on the performance of the Lustre parallel file system

International Journal of High Performance Systems Architecture
A cost-intelligent application-specific data layout scheme for parallel file systems

Proceedings of the 20th international symposium on High performance distributed computing
Improving the average response time in collective I/O

EuroMPI'11 Proceedings of the 18th European MPI Users' Group conference on Recent advances in the message passing interface
Server-side I/O coordination for parallel file systems

Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
Dynamic compilation for reducing energy consumption of i/o-intensive applications

LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
Towards scalable I/O architecture for exascale systems

Proceedings of the 2011 ACM international workshop on Many task computing on grids and supercomputers
DataSpaces: an interaction and coordination framework for coupled simulation workflows

Cluster Computing
Efficient I/O for parallel visualization

EG PGV'11 Proceedings of the 11th Eurographics conference on Parallel Graphics and Visualization
Memory-conscious collective I/O for extreme scale HPC systems

Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers
Improving collective I/O performance by pipelining request aggregation and file access

Proceedings of the 20th European MPI Users' Group Meeting
Cost-intelligent application-specific data layout optimization for parallel file systems

Cluster Computing
Asynchronous object storage with QoS for scientific and commercial big data

PDSW '13 Proceedings of the 8th Parallel Data Storage Workshop
Trilinos I/O Support Trios

Scientific Programming - A New Overview of the Trilinos Project - Part 1

Abstract

We present the architecture and implementation results for Panda 2.0, a library for input and output of multidimensional arrays on parallel and sequential platforms. Panda achieves remarkable performance levels on the IBM SP2, showing excellent scalability as data size increases and as the number of nodes increases, and provides throughputs close to the full capacity of the AIX file system on the SP2 we used. We argue that this good performance can be traced to Panda's use of server-directed i/o (a logical-level version of disk-directed i/o [Kotz94b]) to perform array i/o using sequential disk reads and writes, a very high level interface for collective i/o requests, and built-in facilities for arbitrary rearrangements of arrays during i/o. Other advantages of Panda's approach are ease of use, easy application portability, and a reliance on commodity system software.
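
The server-directed idea in the abstract can be illustrated with a small simulation. The sketch below is not Panda's interface or protocol, which the abstract does not show; every name in it (`Client`, `distribute`, `server_write`, `send_row_piece`) is hypothetical. It assumes a single I/O server, a 2-D array split into rectangular blocks across compute nodes, and a row-major layout on disk, and it replaces real message passing and file writes with in-process calls and a Python list. What it tries to show is the control flow: the server walks the file in sequential order and pulls the pieces each chunk needs, so the array is rearranged on the way to disk and the file is only ever appended to.

```python
from typing import List, Tuple

# (row_start, row_end, col_start, col_end); end indices are exclusive.
Block = Tuple[int, int, int, int]


class Client:
    """Hypothetical compute node holding one rectangular block of a 2-D array."""

    def __init__(self, block: Block, data: List[List[int]]) -> None:
        self.block = block
        self.data = data  # local rows, stored relative to the block origin

    def send_row_piece(self, row: int) -> List[int]:
        """Answer a server request for this client's part of one global row."""
        return self.data[row - self.block[0]]


def distribute(array: List[List[int]], block_rows: int, block_cols: int) -> List[Client]:
    """Split the array into blocks, one per client (the in-memory layout)."""
    clients = []
    for rs in range(0, len(array), block_rows):
        for cs in range(0, len(array[0]), block_cols):
            rows = array[rs:rs + block_rows]
            data = [r[cs:cs + block_cols] for r in rows]
            clients.append(Client((rs, rs + len(rows), cs, cs + len(data[0])), data))
    return clients


def server_write(clients: List[Client], n_rows: int, chunk_rows: int) -> Tuple[List[int], int]:
    """Server-driven loop: walk the file in order and pull what each chunk needs.

    Returns the simulated file contents (row-major) and the number of
    sequential appends issued.
    """
    disk: List[int] = []  # stands in for a file that is only ever appended to
    appends = 0
    for r0 in range(0, n_rows, chunk_rows):
        chunk: List[int] = []
        for row in range(r0, min(r0 + chunk_rows, n_rows)):
            # The server, not the clients, decides who sends what and when.
            owners = [c for c in clients if c.block[0] <= row < c.block[1]]
            for client in sorted(owners, key=lambda c: c.block[2]):
                chunk.extend(client.send_row_piece(row))
        disk.extend(chunk)  # one sequential append per chunk
        appends += 1
    return disk, appends


if __name__ == "__main__":
    # A 4x4 array held as four 2x2 blocks, written out in row-major order.
    array = [[r * 4 + c for c in range(4)] for r in range(4)]
    clients = distribute(array, block_rows=2, block_cols=2)
    disk, appends = server_write(clients, n_rows=4, chunk_rows=2)
    print(disk == list(range(16)), appends)
```

In this toy setup the clients never choose file offsets; they only answer requests, which is what lets the server keep its disk accesses sequential regardless of how the array is distributed in memory.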