Efficient management and exploration of high-volume scientific file repositories have become pivotal for scientific progress. We propose to demonstrate the Data Vault, an extension of the database system architecture that transparently opens scientific file repositories for efficient in-database processing and exploration. The Data Vault supports scientific data analysis in high-level declarative languages, such as traditional SQL and the novel array-oriented SciQL. Data of interest are loaded from the attached repository just in time, without the need for up-front data ingestion. The demo is built around concrete implementations of the Data Vault for two scientific use cases: seismic time series and Earth observation images. The seismic Data Vault uses queries submitted by the audience to illustrate the internals of the Data Vault, revealing the mechanisms of dynamic query plan generation and on-demand external data ingestion. The image Data Vault shows an application view from the perspective of data mining researchers.
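The just-in-time loading idea can be illustrated with a toy sketch. This is not the Data Vault implementation, which lives inside a database system and generates query plans dynamically. The class `LazyVault`, its methods, the station names, and the one-value-per-row CSV layout are all hypothetical and chosen only for illustration. The sketch shows one property: attaching a repository records metadata only, and a file is parsed the first time a query touches it.

```python
import csv
import os
import tempfile


class LazyVault:
    """Toy catalog (hypothetical): registers files up front, parses them on demand."""

    def __init__(self):
        self.catalog = {}  # station name -> file path (metadata only)
        self.cache = {}    # station name -> parsed samples, filled lazily
        self.loads = 0     # number of files actually parsed so far

    def attach(self, directory):
        # Register files by name; no file content is read at this point.
        for name in sorted(os.listdir(directory)):
            if name.endswith(".csv"):
                self.catalog[name[:-4]] = os.path.join(directory, name)

    def _samples(self, station):
        # Parse a file the first time a query needs it, then reuse the result.
        if station not in self.cache:
            with open(self.catalog[station], newline="") as f:
                self.cache[station] = [float(row[0]) for row in csv.reader(f)]
            self.loads += 1
        return self.cache[station]

    def max_amplitude(self, station):
        # A stand-in for a query: largest absolute sample value for one station.
        return max(abs(v) for v in self._samples(station))


def demo():
    # Build a tiny made-up repository of two "seismic" files in a temp directory.
    with tempfile.TemporaryDirectory() as d:
        data = {"STA1": [0.1, -2.5, 1.0], "STA2": [3.0, 0.5]}
        for station, values in data.items():
            with open(os.path.join(d, station + ".csv"), "w", newline="") as f:
                csv.writer(f).writerows([[v] for v in values])

        vault = LazyVault()
        vault.attach(d)
        loads_before = vault.loads           # nothing parsed after attaching
        peak = vault.max_amplitude("STA1")   # triggers parsing of STA1 only
        vault.max_amplitude("STA1")          # second query is served from the cache
        return loads_before, peak, vault.loads


if __name__ == "__main__":
    print(demo())
```

In this sketch the file for `STA2` is never read, because no query refers to it. A real system would also have to handle cache eviction and changes to the underlying files, which the sketch ignores.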