Many scientific applications have large I/O requirements, in terms of both the size of data and the number of files or data sets. Management, storage, efficient access, and analysis of these data present an extremely challenging task. Traditionally, two different solutions are used for this problem: file I/O or databases. File I/O can provide high performance but is tedious to use with large numbers of files and large and complex data sets. Databases can be convenient, flexible, and powerful but do not perform and scale well for parallel supercomputing applications. We have developed a software system, called Scientific Data Manager (SDM), which aims to combine the good features of both file I/O and databases. SDM provides a high-level API to the user and, internally, uses a parallel file system to store real data and a database to store application-related metadata. SDM takes advantage of various I/O optimizations available in MPI-IO, such as collective I/O and noncontiguous requests, in a manner that is transparent to the user. As a result, users can write and retrieve data with the performance of parallel file I/O, without having to bother with the details of actually performing file I/O.

In this paper, we describe the design and implementation of SDM. With the help of two parallel application templates, ASTRO3D and an Euler solver, we illustrate how some of the design criteria affect performance.
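
The split between bulk data in files and metadata in a database can be illustrated with a small, single-process sketch. This is not SDM's API: the class name `TinyDataManager`, its methods, and the table layout are hypothetical, a plain local file stands in for the parallel file system, ordinary file writes stand in for MPI-IO, and SQLite stands in for the metadata database.

```python
import os
import sqlite3
import tempfile
from array import array


class TinyDataManager:
    """Hypothetical illustration only; not the real SDM interface."""

    def __init__(self, data_path, db_path=":memory:"):
        self.data_path = data_path
        # Make sure the data file exists (stand-in for a parallel file).
        open(data_path, "ab").close()
        # Metadata lives in a database, separate from the bulk data.
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS datasets ("
            "name TEXT, timestep INTEGER, typecode TEXT, "
            "offset INTEGER, count INTEGER, "
            "PRIMARY KEY (name, timestep))"
        )

    def write(self, name, timestep, values, typecode="d"):
        """Append the values to the data file and record where they went."""
        buf = array(typecode, values)
        offset = os.path.getsize(self.data_path)  # next free byte in the file
        with open(self.data_path, "ab") as f:
            f.write(buf.tobytes())
        self.db.execute(
            "INSERT INTO datasets VALUES (?, ?, ?, ?, ?)",
            (name, timestep, typecode, offset, len(buf)),
        )
        self.db.commit()

    def read(self, name, timestep):
        """Look up the file offset in the metadata, then read the raw bytes."""
        row = self.db.execute(
            "SELECT typecode, offset, count FROM datasets "
            "WHERE name = ? AND timestep = ?",
            (name, timestep),
        ).fetchone()
        if row is None:
            raise KeyError((name, timestep))
        typecode, offset, count = row
        buf = array(typecode)
        with open(self.data_path, "rb") as f:
            f.seek(offset)
            buf.frombytes(f.read(count * buf.itemsize))
        return buf.tolist()


def demo():
    """Write two time steps of one data set and read the second one back."""
    workdir = tempfile.mkdtemp()
    manager = TinyDataManager(os.path.join(workdir, "data.bin"))
    manager.write("pressure", 0, [1.0, 2.0, 3.0])
    manager.write("pressure", 1, [4.0, 5.0])
    return manager.read("pressure", 1)
```

The caller names a data set and a time step and never handles file offsets, which is the convenience the abstract attributes to the high-level API. In a real parallel setting, the write and read bodies would instead issue collective, possibly noncontiguous, MPI-IO requests.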