Multidimensional Indexing and Query Coordination for Tertiary Storage Management

  • Authors:
  • A. Shoshani;L. M. Bernardo;H. Nordberg;D. Rotem;A. Sim

  • Venue:
  • SSDBM '99 Proceedings of the 11th International Conference on Scientific and Statistical Database Management
  • Year:
  • 1999

Abstract

In many scientific domains, experimental devices or simulation programs generate large volumes of data. These volumes may reach hundreds of terabytes, which makes it impractical to store them on disk systems. Instead, they are stored on robotic tape systems managed by a mass storage system (MSS). A major bottleneck in analyzing the simulated or collected data is the retrieval of subsets from the tertiary storage system. In this paper we describe the architecture and implementation of a Storage Access Coordination System (STACS) designed to optimize the use of a disk cache and thus minimize the number of files read from tape. We achieve this by using a specialized index to locate the relevant data on tapes, and by coordinating file caching over multiple queries. We focus on a specific application area, a high energy physics data management and analysis environment. STACS was implemented and is being incorporated in an operational system scheduled to go on-line at the end of 1999. We also include the results of various tests that demonstrate the benefits and efficiency gained from using STACS.
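
To illustrate why coordinating file caching over multiple queries can reduce tape reads, the following is a minimal sketch and not the STACS algorithm itself. It assumes each query has already been resolved (for example, by an index) to a set of file names, and that the disk cache can hold a file until every query that needs it has used it. All function names and the sample data are hypothetical.

```python
from collections import Counter


def tape_reads_uncoordinated(queries):
    """Count tape reads when every query fetches its own files independently."""
    return sum(len(files) for files in queries.values())


def tape_reads_coordinated(queries):
    """Count tape reads when each distinct file is read once and shared.

    Assumption: the cache keeps a file until all interested queries are done.
    """
    return len(set().union(*queries.values()))


def staging_order(queries):
    """Order files so that those wanted by the most queries come first."""
    demand = Counter(f for files in queries.values() for f in files)
    # Ties are broken by file name to keep the order deterministic.
    return sorted(demand, key=lambda f: (-demand[f], f))


# Hypothetical workload: three queries with overlapping file sets.
queries = {
    "q1": {"f1", "f2", "f3"},
    "q2": {"f2", "f3", "f4"},
    "q3": {"f3", "f5"},
}

print(tape_reads_uncoordinated(queries))  # 8
print(tape_reads_coordinated(queries))    # 5
print(staging_order(queries))             # ['f3', 'f2', 'f1', 'f4', 'f5']
```

In this toy workload, sharing cached files cuts tape reads from 8 to 5. A real system would also have to account for limited cache space and for queries arriving over time, which this sketch ignores.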