Practical selectivity estimation through adaptive sampling
SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Federated database systems for managing distributed, heterogeneous, and autonomous databases
ACM Computing Surveys (CSUR) - Special issue on heterogeneous databases
Adaptive selectivity estimation using query feedback
SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Balancing histogram optimality and practicality for query result size estimation
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Bifocal sampling for skew-resistant join size estimation
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Improved histograms for selectivity estimation of range predicates
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Random sampling for histogram construction: how much is enough?
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Approximating block accesses in database organizations
Communications of the ACM
Analysis and performance of inverted data base structures
Communications of the ACM
Proceedings of the tenth international conference on Information and knowledge management
Estimating Bucket Accesses: A Practical Approach
Proceedings of the Second International Conference on Data Engineering
Optimizing Queries Across Diverse Data Sources
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Reducing the Braking Distance of an SQL Query Engine
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Computing Iceberg Queries Efficiently
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Evaluating Top-k Selection Queries
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Probabilistic Optimization of Top N Queries
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
On multi-query optimization
Hi-index | 0.00 |
Abstract--New application domains cause today's database sizes to grow rapidly, posing great demands on technology. Data fragmentation facilitates techniques (like distribution, parallelization, and main-memory computing) meeting these demands. Also, fragmentation might help to improve efficient processing of query types such as top {\rm{N}}. Database design and query optimization require a good notion of the costs resulting from a certain fragmentation. Our mathematically derived selectivity model facilitates this. Once its two parameters have been computed based on the fragmentation, after each (though usually infrequent) update, our model can forget the data distribution, resulting in fast and quite good selectivity estimation. We show experimental verification for Zipfian distributed IR databases.