Many Geographic Information Systems (GIS) handle large geospatial datasets stored in raster representation. Spatial joins over raster data are important GIS queries for data analysis and decision support. However, evaluating spatial joins can be very time-consuming because of the size of these datasets. In this paper we propose a new interactive framework that lets users obtain approximate answers in near-instantaneous time, enabling truly interactive data exploration. Our method uses two proposed statistical approaches: a probabilistic join and a sampling-based join. The probabilistic join provides a speedup of two orders of magnitude with no correctness guarantee, while the sampling-based join provides an order-of-magnitude improvement over the full quad-tree join and also provides running confidence intervals. We propose a framework that combines the two approaches so that end users can trade off speed against bounded accuracy. The two approaches are evaluated empirically on real and synthetic datasets.
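To make the idea of a sampling-based join with running confidence intervals concrete, the sketch below estimates how many cells of two aligned rasters satisfy a join predicate. It is an illustration under stated assumptions, not the paper's algorithm: it uses uniform cell sampling with replacement and a normal-approximation interval. The names `running_join_estimate`, `predicate`, `batch_size`, and `max_samples` are hypothetical.

```python
import math
import random


def running_join_estimate(raster_a, raster_b, predicate,
                          batch_size=200, max_samples=2000, z=1.96, seed=0):
    """Yield (samples_seen, estimated_count, half_width) after each batch.

    Assumptions (not from the paper): both rasters are aligned 2D lists of
    equal shape, cells are sampled uniformly with replacement, and the
    interval is a normal approximation for a proportion (z=1.96 ~ 95%).
    """
    rows, cols = len(raster_a), len(raster_a[0])
    total_cells = rows * cols
    rng = random.Random(seed)
    hits = 0
    seen = 0
    while seen < max_samples:
        for _ in range(batch_size):
            r = rng.randrange(rows)
            c = rng.randrange(cols)
            if predicate(raster_a[r][c], raster_b[r][c]):
                hits += 1
        seen += batch_size
        p = hits / seen  # estimated fraction of matching cells
        half = z * math.sqrt(p * (1.0 - p) / seen)
        # Scale the proportion and its half-width to a cell count.
        yield seen, p * total_cells, half * total_cells


if __name__ == "__main__":
    # Synthetic 100x100 rasters: cell value is the row index in one raster
    # and the column index in the other.
    a = [[r for _ in range(100)] for r in range(100)]
    b = [[c for c in range(100)] for _ in range(100)]
    for seen, est, half in running_join_estimate(
            a, b, lambda x, y: x < 50 and y < 50):
        print(f"{seen:5d} samples: about {est:.0f} +/- {half:.0f} cells")
```

A caller could stop consuming the generator as soon as the half-width is small enough for the task, which is one simple way to trade speed against bounded accuracy.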