Data vaults: a database welcome to scientific file repositories

  • Authors:
  • Milena Ivanova;Yağiz Kargin;Martin Kersten;Stefan Manegold;Ying Zhang;Mihai Datcu;Daniela Espinoza Molina

  • Affiliations:
  • Netherlands eScience Center;CWI Amsterdam;CWI Amsterdam;CWI Amsterdam;CWI Amsterdam;DLR;DLR

  • Venue:
  • Proceedings of the 25th International Conference on Scientific and Statistical Database Management
  • Year:
  • 2013

Abstract

Efficient management and exploration of high-volume scientific file repositories have become pivotal for the advancement of science. We propose to demonstrate the Data Vault, an extension of the database system architecture that transparently opens scientific file repositories for efficient in-database processing and exploration. The Data Vault facilitates scientific data analysis using high-level declarative languages, such as traditional SQL and the novel array-oriented SciQL. Data of interest are loaded from the attached repository in a just-in-time manner, without the need for up-front data ingestion. The demo is built around concrete implementations of the Data Vault for two scientific use cases: seismic time series and Earth observation images. The seismic Data Vault uses queries submitted by the audience to illustrate the internals of the Data Vault, revealing the mechanisms of dynamic query plan generation and on-demand external data ingestion. The image Data Vault shows an application view from the perspective of data mining researchers.
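
The just-in-time loading described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, which lives inside the database system and its query planner. The `DataVault` class, its method names, and the one-CSV-file-per-station layout are all assumptions made for illustration. The sketch only shows the general idea: attaching a repository records file metadata, and a file's contents are read the first time a query touches it.

```python
import csv
import tempfile
from pathlib import Path


class DataVault:
    """Toy just-in-time loader (hypothetical): catalogs files up front, reads them on demand."""

    def __init__(self):
        self.catalog = {}   # station name -> file path (metadata only)
        self.cache = {}     # station name -> samples already ingested
        self.loads = 0      # number of files actually read from disk

    def attach(self, repository):
        # Register every CSV file by name; file contents are not read here.
        for path in sorted(Path(repository).glob("*.csv")):
            self.catalog[path.stem] = path

    def samples(self, station):
        # Ingest the file the first time a query touches it, then reuse the cache.
        if station not in self.cache:
            with open(self.catalog[station], newline="") as handle:
                self.cache[station] = [float(row[0]) for row in csv.reader(handle)]
            self.loads += 1
        return self.cache[station]

    def mean(self, station):
        # Stand-in for a declarative query over one station's time series.
        values = self.samples(station)
        return sum(values) / len(values)


def demo():
    with tempfile.TemporaryDirectory() as repo:
        # Hypothetical repository: one small time-series file per station.
        for name, values in {"STA1": [1.0, 2.0, 3.0], "STA2": [10.0, 20.0]}.items():
            Path(repo, f"{name}.csv").write_text("\n".join(map(str, values)) + "\n")

        vault = DataVault()
        vault.attach(repo)
        loads_after_attach = vault.loads   # nothing has been ingested yet
        first = vault.mean("STA1")         # triggers ingestion of STA1 only
        second = vault.mean("STA1")        # served from the cache
        return loads_after_attach, first, second, vault.loads
```

Calling `demo()` attaches a two-file repository, queries one station twice, and reports how many files were read. The file for `STA2` is never opened because no query refers to it.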