Oracle Spatial and Graph is a geographic information system (GIS) that lets users store spatial data alongside conventional data in Oracle. Because spatial and non-spatial data coexist, users increasingly issue complex queries that combine spatial and non-spatial predicates. Accurate selectivity values, especially for queries with multiple predicates that require joins among numerous tables, are essential for the database optimizer to choose a good execution plan. For queries involving spatial predicates, this requires that reasonably accurate statistics have been collected on the spatial data. For extensible data cartridges such as Oracle Spatial and Graph, the optimizer expects to receive accurate predicate selectivity and cost values from functions implemented within the data cartridge. Although statistics collection for spatial data has been researched in academia for a few years, to the best of our knowledge this is the first work to present implementation details of spatial statistics collection for a commercial GIS database. In this paper, we describe our experience implementing statistics collection methods for complex geometry objects within Oracle Spatial and Graph. First, we illustrate the issues that previous partitioning-based algorithms exhibit in the presence of complex geometry objects and suggest enhancements that resolve them. Second, we propose a main-memory implementation that not only speeds up the disk-based partitioning algorithms but also uses existing R-tree indexes to provide surprisingly accurate selectivity estimates. Finally, we provide extensive experimental results and an example study that demonstrates the effect of our approach on Oracle query performance.