Data management for large-scale scientific computations in high performance distributed systems

Authors:
A. Choudhary;M. Kandemir;J. No;G. Memik;X. Shen;W. Liao;H. Nagesh;S. More;V. Taylor;R. Thakur;R. Stevens
Affiliations:
Center for Parallel and Distributed Computing, Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208, USA;Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, USA;Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA;Center for Parallel and Distributed Computing, Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208, USA;Center for Parallel and Distributed Computing, Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208, USA;Center for Parallel and Distributed Computing, Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208, USA;Center for Parallel and Distributed Computing, Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208, USA;Center for Parallel and Distributed Computing, Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208, USA;Center for Parallel and Distributed Computing, Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208, USA;Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA;Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
Venue:
Cluster Computing
Year:
2000



Abstract

With the increasing number of scientific applications manipulating huge amounts of data, effective high-level data management is an increasingly important problem. Unfortunately, existing solutions to this problem either require a deep understanding of specific storage architectures and file layouts (as in high-performance file storage systems) or deliver unsatisfactory I/O performance in exchange for ease of use and portability (as in relational DBMSs). In this paper we present a novel application development environment built around an active meta-data management system (MDMS) that handles high-level data effectively. The key components of our three-tiered architecture are the user application, the MDMS, and a hierarchical storage system (HSS). Our environment overcomes the performance problems of pure database-oriented solutions while retaining their advantages in ease of use and portability. The MDMS achieves high performance with the aid of user-specified, performance-oriented directives. Our environment supports a simple, easy-to-use yet powerful user interface, leaving the task of choosing appropriate I/O techniques for the application at hand to the MDMS. We discuss the importance of an active MDMS and show how the three components of our environment, namely the application, the MDMS, and the HSS, fit together. We also report performance numbers from our ongoing implementation and show that significant improvements are possible without undue programming effort.
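
The division of labor described in the abstract, where the application supplies directives and the MDMS picks the I/O technique, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the class names, the directive values ("contiguous", "strided"), the storage-tier labels, and the selection rules are hypothetical and are not taken from the paper's implementation.

```python
from dataclasses import dataclass


@dataclass
class DatasetMeta:
    """Hypothetical meta-data record kept by the middle tier."""
    name: str
    location: str        # assumed tier label: "local_disk", "remote_disk", or "tape"
    access_pattern: str  # assumed user directive: "contiguous" or "strided"


class MetaDataManager:
    """Toy stand-in for the MDMS tier; not the authors' implementation."""

    def __init__(self):
        self._catalog = {}

    def register(self, meta):
        # The application registers a dataset together with its directives.
        self._catalog[meta.name] = meta

    def choose_io_method(self, name):
        # Illustrative rules only: the application never names an I/O
        # technique itself, it only asks the manager for one.
        meta = self._catalog[name]
        if meta.location == "tape":
            return "stage_to_disk_then_read"
        if meta.access_pattern == "strided":
            return "collective_io"
        return "independent_io"
```

In this sketch an application would call `register` once per dataset and then ask `choose_io_method` before each access, so swapping the storage tier or the selection rules would not require changes to application code.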