Data Management for Large-Scale Scientific Computations in High Performance Distributed Systems

Authors:
A. Choudhary;M. Kandemir;H. Nagesh;J. No;X. Shen;V. Taylor;S. More;R. Thakur
Venue:
HPDC '99 Proceedings of the 8th IEEE International Symposium on High Performance Distributed Computing
Year:
1999

Citing 0
Cited 4

Integrating parallel file I/O and database support for high-performance scientific data management

Proceedings of the 2000 ACM/IEEE conference on Supercomputing
An Efficient Algorithm for Large-Scale Matrix Transposition

ICPP '00 Proceedings of the 2000 International Conference on Parallel Processing
A high-performance distributed parallel file system for data-intensive computations

Journal of Parallel and Distributed Computing
Bitmap indexes for large scientific data sets: a case study

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing

Abstract

With the increasing number of scientific applications manipulating huge amounts of data, effective data management is an increasingly important problem. Unfortunately, so far the solutions to this data management problem either require deep understanding of specific storage architectures and file layouts (as in high-performance file systems) or produce unsatisfactory I/O performance in exchange for ease-of-use and portability (as in relational DBMSs).

In this paper we present a new environment which is built around an active meta-data management system (MDMS). The key components of our three-tiered architecture are user application, the MDMS, and a hierarchical storage system (HSS). Our environment overcomes the performance problems of pure database-oriented solutions, while maintaining their advantages in terms of ease-of-use and portability.

The high levels of performance are achieved by the MDMS, with the aid of user-specified directives. Our environment supports a simple, easy-to-use yet powerful user interface, leaving the task of choosing appropriate I/O techniques to the MDMS. We discuss the importance of an active MDMS and show how the three components, namely application, the MDMS, and the HSS, fit together. We also report performance numbers from our initial implementation and illustrate that significant improvements are made possible without undue programming effort.
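
The three-tier arrangement described in the abstract can be pictured with a small sketch. This is not the paper's interface: the class names, the directive values, the storage path, and the mapping from directive to I/O technique are all hypothetical, chosen only to show an application naming a dataset while a metadata tier picks the I/O strategy and a storage tier holds the bytes.

```python
from dataclasses import dataclass, field
from enum import Enum


class AccessPattern(Enum):
    """Hypothetical user directive: how the application expects to access data."""
    SEQUENTIAL = "sequential"
    STRIDED = "strided"
    UNKNOWN = "unknown"


@dataclass
class DatasetRecord:
    """Metadata kept by the middle tier for one dataset (illustrative fields)."""
    name: str
    storage_path: str
    pattern: AccessPattern = AccessPattern.UNKNOWN


@dataclass
class MetadataManager:
    """Toy stand-in for the MDMS tier."""
    records: dict = field(default_factory=dict)

    def register(self, name, storage_path, pattern=AccessPattern.UNKNOWN):
        self.records[name] = DatasetRecord(name, storage_path, pattern)

    def choose_io_strategy(self, name):
        # Assumed mapping from directive to technique; the paper's actual
        # selection rules are not given in the abstract.
        pattern = self.records[name].pattern
        if pattern is AccessPattern.SEQUENTIAL:
            return "prefetch"
        if pattern is AccessPattern.STRIDED:
            return "collective"
        return "independent"


class InMemoryStore:
    """Toy stand-in for the storage (HSS) tier: maps a path to bytes."""

    def __init__(self):
        self._blobs = {}

    def write(self, path, data):
        self._blobs[path] = data

    def read(self, path, strategy):
        # A real system would dispatch to different I/O routines here;
        # this sketch ignores the strategy and simply returns the data.
        return self._blobs[path]


def read_dataset(mdms, store, name):
    """Application-tier call: refers to a dataset by name, not by file layout."""
    record = mdms.records[name]
    strategy = mdms.choose_io_strategy(name)
    return strategy, store.read(record.storage_path, strategy)


mdms = MetadataManager()
store = InMemoryStore()
store.write("/hss/run1/temperature.dat", b"\x00\x01\x02")
mdms.register("temperature", "/hss/run1/temperature.dat", AccessPattern.STRIDED)
strategy, data = read_dataset(mdms, store, "temperature")
```

In this sketch the application never mentions a file layout or an I/O routine; it supplies only a dataset name and an optional hint, which is the division of labor the abstract attributes to the MDMS.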