Improving Access to Multi-dimensional Self-describing Scientific Datasets

Authors:
Beomseok Nam;Alan Sussman
Affiliations:
-;-
Venue:
CCGRID '03 Proceedings of the 3rd International Symposium on Cluster Computing and the Grid
Year:
2003

Citing 0
Cited 3

Flexible multi-dimensional indexing server for searching non-textual diagnostic annotations
EuroIMSA '08 Proceedings of the IASTED International Conference on Internet and Multimedia Systems and Applications

Managing and searching distributed multidimensional annotations with large scale image data
MCAM'07 Proceedings of the 2007 international conference on Multimedia content analysis and mining

An overview of the HDF5 technology suite and its applications
Proceedings of the EDBT/ICDT 2011 Workshop on Array Databases

Abstract

Applications that query into very large multi-dimensional datasets are becoming more common. Many self-describing scientific data file formats have also emerged, which have structural metadata to help navigate the multi-dimensional arrays that are stored in the files. The files may also contain application-specific semantic metadata. In this paper, we discuss efficient methods for performing searches for subsets of multi-dimensional data objects, using semantic information to build multi-dimensional indexes, and grouping data items into properly sized chunks to maximize disk I/O bandwidth. This work is the first step in the design and implementation of a generic indexing library that will work with various high-dimension scientific data file formats containing semantic information about the stored data. To validate the approach, we have implemented indexing structures for NASA remote sensing data stored in the HDF format with a specific schema (HDF-EOS), and show the performance improvements that are gained from indexing the datasets, compared to using the existing HDF library for accessing the data.
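
The general idea of indexing chunks by their semantic coordinates can be illustrated with a minimal sketch. This is not the authors' implementation and it does not use the HDF or HDF-EOS libraries. It assumes a 2-D array whose elements each carry a latitude and a longitude, groups the elements into fixed-size chunks, and records one bounding box per chunk. All names here (ChunkEntry, build_index, query) are hypothetical, and a linear scan over the bounding boxes stands in for a real multi-dimensional index structure such as an R-tree.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class ChunkEntry:
    """Hypothetical index entry: the (lat, lon) bounding box of one chunk."""
    row0: int        # first array row covered by the chunk
    col0: int        # first array column covered by the chunk
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float


def build_index(lat: Sequence[Sequence[float]],
                lon: Sequence[Sequence[float]],
                chunk_rows: int,
                chunk_cols: int) -> List[ChunkEntry]:
    """Group array elements into chunks and record each chunk's bounding box."""
    n_rows, n_cols = len(lat), len(lat[0])
    entries = []
    for r0 in range(0, n_rows, chunk_rows):
        for c0 in range(0, n_cols, chunk_cols):
            # Clip the chunk at the array edge so partial chunks are handled.
            cells = [(r, c)
                     for r in range(r0, min(r0 + chunk_rows, n_rows))
                     for c in range(c0, min(c0 + chunk_cols, n_cols))]
            lats = [lat[r][c] for r, c in cells]
            lons = [lon[r][c] for r, c in cells]
            entries.append(ChunkEntry(r0, c0, min(lats), max(lats),
                                      min(lons), max(lons)))
    return entries


def query(entries: Sequence[ChunkEntry],
          lat_range: Tuple[float, float],
          lon_range: Tuple[float, float]) -> List[Tuple[int, int]]:
    """Return the origins of chunks whose bounding box overlaps the query box.

    A linear scan is used for clarity; a tree-based index would prune
    non-overlapping chunks without visiting every entry.
    """
    lat_lo, lat_hi = lat_range
    lon_lo, lon_hi = lon_range
    return [(e.row0, e.col0) for e in entries
            if e.lat_min <= lat_hi and e.lat_max >= lat_lo
            and e.lon_min <= lon_hi and e.lon_max >= lon_lo]


# Synthetic 4x4 grid: latitude grows with the row, longitude with the column.
lat_grid = [[10.0 + r for _ in range(4)] for r in range(4)]
lon_grid = [[20.0 + c for c in range(4)] for _ in range(4)]

index = build_index(lat_grid, lon_grid, chunk_rows=2, chunk_cols=2)
hits = query(index, lat_range=(10.5, 11.5), lon_range=(22.5, 23.5))
print(hits)
```

Only the chunks returned by the query would then need to be read from disk, which is where a chunk size matched to the storage system's I/O characteristics matters.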