Query optimizers in object-relational database management systems typically require users to provide execution cost models for user-defined functions (UDFs). Despite this need, little work has been done on building such models. Existing approaches are static: they require users to train the model a priori with pregenerated UDF execution cost data. Static approaches cannot adapt to changing UDF execution patterns, so their accuracy degrades when the executions used to generate the training data do not reflect those performed during operation. This article proposes a new approach, in line with the recent trend toward self-tuning DBMSs, in which the cost model is maintained dynamically and incrementally as UDFs are executed online. In the context of UDF cost modeling, this approach faces several challenges: it must work with limited memory, work within limited computation time, and adjust to fluctuations in execution costs (e.g., caching effects). We first provide a set of guidelines for developing techniques that meet these challenges while achieving accurate and fast cost prediction with small overhead. We then present two concrete techniques developed under these guidelines. One is an instance-based technique built on the conventional k-nearest neighbor (KNN) technique, which uses a multidimensional index such as the R*-tree. The other is a summary-based technique that uses a quadtree to store summary values at multiple resolutions. We have performed extensive evaluations comparing these two techniques against existing histogram-based techniques and the KNN technique, using both real and synthetic UDFs and data sets. The results show that our techniques provide better performance in most of the situations considered.
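To make the summary-based idea more concrete, the following is a minimal sketch of how a quadtree could keep cost summaries at multiple resolutions and be updated online. It is an illustration under stated assumptions, not the article's actual technique: the names `Node` and `QuadtreeCostModel`, the parameters `max_depth` and `min_count`, the restriction to two numeric UDF arguments, and the choice of a running average as the summary value are all hypothetical. Capping the depth is one simple way to bound memory, and each update or prediction touches at most `max_depth + 1` nodes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """One quadtree cell holding running summary values for costs observed in it."""
    x0: float
    y0: float
    x1: float
    y1: float
    depth: int
    count: int = 0        # number of observed executions falling in this cell
    total: float = 0.0    # sum of their observed costs
    children: Optional[List["Node"]] = None

    def split(self) -> None:
        """Create the four equal-sized child cells."""
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        d = self.depth + 1
        self.children = [
            Node(self.x0, self.y0, mx, my, d),  # index 0: low x, low y
            Node(mx, self.y0, self.x1, my, d),  # index 1: high x, low y
            Node(self.x0, my, mx, self.y1, d),  # index 2: low x, high y
            Node(mx, my, self.x1, self.y1, d),  # index 3: high x, high y
        ]

    def child_for(self, x: float, y: float) -> "Node":
        """Return the child cell containing the point (x, y)."""
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        return self.children[(1 if x >= mx else 0) + (2 if y >= my else 0)]


class QuadtreeCostModel:
    """Hypothetical online cost model over two numeric UDF arguments."""

    def __init__(self, bounds=(0.0, 0.0, 1.0, 1.0), max_depth=4, min_count=3):
        self.root = Node(*bounds, depth=0)
        self.max_depth = max_depth  # assumption: depth cap bounds memory use
        self.min_count = min_count  # assumption: samples needed to trust a cell

    def observe(self, x: float, y: float, cost: float) -> None:
        """Fold one observed execution cost into every cell on the path."""
        node = self.root
        while True:
            node.count += 1
            node.total += cost
            if node.depth == self.max_depth:
                return
            if node.children is None:
                node.split()
            node = node.child_for(x, y)

    def predict(self, x: float, y: float) -> Optional[float]:
        """Average cost of the finest cell on the path with enough samples."""
        node, best = self.root, None
        while node is not None and node.count >= self.min_count:
            best = node.total / node.count
            node = node.child_for(x, y) if node.children else None
        return best


# Usage: feed costs as the UDF runs, then ask for an estimate.
model = QuadtreeCostModel(max_depth=2, min_count=2)
model.observe(0.1, 0.1, 10.0)
model.observe(0.2, 0.2, 20.0)
model.observe(0.9, 0.9, 100.0)
estimate = model.predict(0.15, 0.15)
```

In this sketch, a prediction falls back to a coarser cell when the finer ones have seen too few executions, which is one plausible use of multi-resolution summaries. A plain running average does not forget old observations, so adapting to cost fluctuations such as caching effects would need an additional mechanism (for example, a decayed average), which is not shown here.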