Variable Interactions in Query-Driven Visualization
IEEE Transactions on Visualization and Computer Graphics
High performance multivariate visual data exploration for extremely large data
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Armazenamento distribuído de imagens médicas DICOM no formato de dados HDF5 [Distributed storage of DICOM medical images in the HDF5 data format]
Proceedings of the 14th Brazilian Symposium on Multimedia and the Web
An architecture for DICOM medical images storage and retrieval adopting distributed file systems
International Journal of High Performance Systems Architecture
FastQuery: a general indexing and querying system for scientific data
SSDBM'11 Proceedings of the 23rd international conference on Scientific and statistical database management
Parallel index and query for large scale data analysis
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
Large-scale scientific data is often stored in scientific data formats such as FITS, netCDF and HDF. These storage formats are of particular interest to the scientific user community because they provide multi-dimensional storage and retrieval. However, one drawback of these formats is that they do not support semantic indexing, which is important for interactive data analysis where scientists look for features of interest such as "Find all supernova explosions where energy > 10^5 and temperature > 10^6". In this paper we present a novel approach called HDF5-FastQuery that accelerates data access to large HDF5 files by introducing multi-dimensional semantic indexing. Our implementation leverages an efficient indexing technology called bitmap indexing, which has been widely used in the database community. Bitmap indices are especially well suited for interactive exploration of large-scale read-only data. Storing the bitmap indices in the HDF5 file has the following advantages: (a) a significant speedup when accessing subsets of multi-dimensional data and (b) portability of the indices across multiple computer platforms. We present an API that simplifies the execution of queries on HDF5 files for general scientific applications and data analysis. The design is flexible enough to accommodate the use of arbitrary indexing technology for semantic range queries. We also provide a detailed performance analysis of HDF5-FastQuery for both synthetic and scientific data. The results demonstrate that our proposed approach for multi-dimensional queries is up to a factor of 2 faster than HDF5.
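To make the idea of a bitmap index for semantic range queries concrete, here is a minimal toy sketch in pure Python. It is not the HDF5-FastQuery API and not the authors' implementation: the function names, the bin edges, and the sample values are all hypothetical, each bitmap is held as an uncompressed Python integer, and nothing is read from or written to an HDF5 file. It only illustrates how a binned bitmap index can answer a query like "energy > 10^5 and temperature > 10^6" by combining bitmaps and checking raw values only in the bin that straddles the threshold.

```python
from bisect import bisect_right


def build_bitmap_index(values, bin_edges):
    """Build one bitmap per bin; bit i is set if values[i] falls in that bin.

    Bin b holds values v with bin_edges[b-1] <= v < bin_edges[b].
    (Hypothetical helper, for illustration only.)
    """
    bitmaps = [0] * (len(bin_edges) + 1)
    for i, v in enumerate(values):
        bitmaps[bisect_right(bin_edges, v)] |= 1 << i
    return bitmaps


def rows(bitmap):
    """Return the row numbers whose bits are set in the bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


def greater_than(values, bin_edges, bitmaps, threshold):
    """Return a bitmap of rows whose value is strictly above the threshold."""
    b = bisect_right(bin_edges, threshold)  # bin that contains the threshold
    result = 0
    # Bins entirely above the threshold are certain hits: OR their bitmaps.
    for bm in bitmaps[b + 1:]:
        result |= bm
    # The boundary bin holds candidates only; check them against the raw data.
    for i in rows(bitmaps[b]):
        if values[i] > threshold:
            result |= 1 << i
    return result


# Made-up sample data: one entry per simulated event.
energy = [5e4, 2e5, 3e5, 9e4, 1.5e5]
temperature = [2e6, 5e5, 3e6, 4e6, 1.5e6]

# Assumed bin edges, chosen arbitrarily for this example.
energy_edges = [5e4, 2.5e5]
temperature_edges = [1e6, 3.5e6]

energy_index = build_bitmap_index(energy, energy_edges)
temperature_index = build_bitmap_index(temperature, temperature_edges)

# "energy > 10^5 and temperature > 10^6": a bitwise AND of two range results.
hits = (greater_than(energy, energy_edges, energy_index, 1e5)
        & greater_than(temperature, temperature_edges, temperature_index, 1e6))

print(rows(hits))  # row numbers that satisfy both conditions
```

The multi-variable condition reduces to a bitwise AND of per-variable bitmaps, so most rows are never inspected individually. Practical bitmap indices usually also compress the bitmaps, which this sketch omits.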