Finding Regions of Interest in Large Scientific Datasets

Authors:
Rishi Rakesh Sinha; Marianne Winslett; Kesheng Wu
Affiliations:
Microsoft Corporation, Redmond; Department of Computer Science, University of Illinois at Urbana-Champaign; Lawrence Berkeley National Laboratory
Venue:
SSDBM 2009 Proceedings of the 21st International Conference on Scientific and Statistical Database Management
Year:
2009

Abstract

Consider a scientific range query, such as "find all places in Africa where yesterday the temperature was over 35 degrees and it rained." In theory, one can answer such queries by returning all geographic points that satisfy the query condition. In practice, however, users do not find this low-level answer very useful; instead, they require the points to be consolidated into regions, i.e., sets of points that all satisfy the query conditions and are adjacent in the underlying mesh. In this paper, we show that when a high-quality index is used to find the points and a good traditional connected component labeling algorithm is used to build the regions, the cost of consolidating the points into regions dominates range query response time. We then show how to find query result points and consolidate them into regions in expected time that is sublinear in the number of result points. This seemingly miraculous speedup comes from two phases: a point lookup phase that uses bitmap indexes and produces a compressed bitmap as the intermediate query result, followed by a region consolidation phase that operates directly on that bitmap and exploits the spatial properties of the underlying mesh to greatly reduce the cost of consolidating the result points into regions. Our experiments with real-world scientific data demonstrate that, in practice, our approach to region consolidation is over 10 times faster than a traditional connected component algorithm.
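To make the idea of consolidating points into regions more concrete, the following is a minimal illustrative sketch, not the algorithm from the paper. It assumes a regular 2D mesh stored in row-major order, 4-connectivity (points are adjacent only horizontally or vertically), and an uncompressed 0/1 result bitmap. It groups consecutive 1 bits into runs and merges overlapping runs in neighboring rows with union-find, so the labeling work is done per run rather than per point. The function names `runs_per_row`, `find`, and `label_regions` are hypothetical.

```python
from itertools import groupby


def runs_per_row(bitmap, width):
    """Split a row-major 0/1 bitmap into runs of consecutive 1s for each mesh row.

    Returns a list (one entry per row) of (start_col, end_col) tuples, inclusive.
    """
    rows = []
    for start in range(0, len(bitmap), width):
        row = bitmap[start:start + width]
        runs, col = [], 0
        for bit, group in groupby(row):
            length = len(list(group))
            if bit:
                runs.append((col, col + length - 1))
            col += length
        rows.append(runs)
    return rows


def find(parent, i):
    """Union-find lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def label_regions(bitmap, width):
    """Group the 1 bits of a row-major bitmap into 4-connected regions.

    Each region is returned as a list of (row, start_col, end_col) runs.
    Assumption: a regular 2D mesh; the paper's meshes may be more general.
    """
    rows = runs_per_row(bitmap, width)
    parent, run_info, prev_ids = [], [], []
    for r, runs in enumerate(rows):
        cur_ids = []
        for start, end in runs:
            cur_ids.append(len(parent))
            parent.append(len(parent))
            run_info.append((r, start, end))
        if r > 0:
            prev = rows[r - 1]
            i = j = 0
            # Two-pointer sweep: runs in adjacent rows touch if their columns overlap.
            while i < len(prev) and j < len(runs):
                (ps, pe), (cs, ce) = prev[i], runs[j]
                if ps <= ce and cs <= pe:
                    a, b = find(parent, prev_ids[i]), find(parent, cur_ids[j])
                    if a != b:
                        parent[b] = a
                if pe < ce:
                    i += 1
                else:
                    j += 1
        prev_ids = cur_ids
    regions = {}
    for rid, info in enumerate(run_info):
        regions.setdefault(find(parent, rid), []).append(info)
    return list(regions.values())
```

A caller would pass the query result bitmap and the mesh width, for example `label_regions(bits, width)`, and receive one list of runs per region. Working on runs mirrors, in a simplified way, the benefit of operating on a run-length-compressed bitmap instead of on individual result points.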