A Selectivity Model for Fragmented Relations: Applied in Information Retrieval

Authors:
Henk Ernst Blok;Sunil Choenni;Henk M. Blanken;Peter M. G. Apers
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2004

Abstract

New application domains cause today's databases to grow rapidly, placing heavy demands on technology. Data fragmentation enables techniques such as distribution, parallelization, and main-memory computing that help meet these demands. Fragmentation may also make certain query types, such as top-N queries, more efficient to process. Database design and query optimization both require a good estimate of the costs that a given fragmentation incurs. Our mathematically derived selectivity model provides such an estimate. The model has two parameters, which are computed from the fragmentation and recomputed after each (usually infrequent) update. Once they are known, the model no longer needs the data distribution, which yields fast and reasonably accurate selectivity estimation. We verify the model experimentally on IR databases with Zipfian distributions.