Exploring spatial datasets with histograms

Authors:
Chengyu Sun;Nagender Bandi;Divyakant Agrawal;Amr El Abbadi
Affiliations:
Department of Computer Science, California State University, Los Angeles;Department of Computer Science, University of California, Santa Barbara;Department of Computer Science, University of California, Santa Barbara;Department of Computer Science, University of California, Santa Barbara
Venue:
Distributed and Parallel Databases
Year:
2006

Abstract

As online spatial datasets grow in both number and sophistication, it becomes increasingly difficult for users to decide whether a dataset suits their tasks, especially when they have no prior knowledge of it. In this paper, we propose browsing as an effective and efficient way to explore the content of a spatial dataset. Browsing lets users see the size of a result set before the query is evaluated at the database, thereby avoiding zero-hit and mega-hit queries and saving time and resources. Although the underlying technique is similar to range query aggregation and selectivity estimation, spatial dataset browsing poses some unique challenges. We identify a set of spatial relations that browsing applications need to support, namely the contains, contained, and overlap relations. We prove a lower bound on the storage required to answer queries about the contains relation accurately at a given resolution. We then present three storage-efficient approximation algorithms, which we believe to be the first to estimate query results for these spatial relations. We evaluate these algorithms on both synthetic and real-world datasets and show that they provide highly accurate estimates for datasets with various characteristics.
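
To make the three relations concrete, the following is a minimal sketch of the exact (brute-force) counts that a browsing interface would want to approximate. It is not one of the paper's three approximation algorithms, and it does not use histograms. The abstract does not define the relations, so the definitions here are assumptions: objects are axis-aligned rectangles, "contains" means the query window contains the object, "contained" means the query window lies inside the object, and "overlap" means the two intersect without either containing the other. The names `Rect`, `relation`, and `browse_counts` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle with x1 <= x2 and y1 <= y2 (assumed object model)."""
    x1: float
    y1: float
    x2: float
    y2: float


def relation(query: Rect, obj: Rect) -> str:
    """Classify obj relative to the query window (assumed definitions)."""
    # No shared point at all: the object does not count toward any relation.
    if (obj.x2 < query.x1 or obj.x1 > query.x2 or
            obj.y2 < query.y1 or obj.y1 > query.y2):
        return "disjoint"
    # The query window fully encloses the object.
    if (query.x1 <= obj.x1 and obj.x2 <= query.x2 and
            query.y1 <= obj.y1 and obj.y2 <= query.y2):
        return "contains"
    # The object fully encloses the query window.
    if (obj.x1 <= query.x1 and query.x2 <= obj.x2 and
            obj.y1 <= query.y1 and query.y2 <= obj.y2):
        return "contained"
    # They intersect, but neither encloses the other.
    return "overlap"


def browse_counts(query: Rect, objects: Iterable[Rect]) -> Dict[str, int]:
    """Exact result-set sizes per relation, computed by scanning every object."""
    counts = {"contains": 0, "contained": 0, "overlap": 0, "disjoint": 0}
    for obj in objects:
        counts[relation(query, obj)] += 1
    return counts
```

Scanning every object costs time linear in the dataset size for each query window, which is the cost a precomputed summary is meant to avoid.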