Database optimizers require statistical information about data distributions in order to estimate result sizes and access plan costs when processing user queries. In this context, we consider the problem of estimating the size of the projections of a database relation when measures of attribute domain cardinalities are maintained in the system. Our main theoretical contribution is a new formal model (AD), valid under the hypotheses of attribute independence and uniform distribution of attribute values. It is derived by considering the difference between the time-invariant domain (the set of values that an attribute can assume) and the time-dependent "active domain" (the set of values that are actually assumed at a given time). Early models developed under the same assumptions are shown to be formally incorrect. Since the AD model is computationally demanding, we also introduce an approximate, easy-to-compute model (A2D) that, unlike previous approximations, yields low errors over the entire parameter space of the active domain cardinalities. Finally, we extend the A2D model to the case of nonuniform distributions and present experimental results confirming the good behavior of the model.
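
To make the quantities in the abstract concrete, the following minimal Python sketch computes an active domain and an exact projection size on a toy relation, and compares the latter with a classic textbook baseline. The baseline is an assumption introduced here for illustration: it is the expected number of distinct combinations when rows are drawn uniformly and independently, with replacement, from the product of the attribute cardinalities. It is not the AD or A2D model, whose formulas the abstract does not give. The function names and the toy data are hypothetical.

```python
from math import prod


def active_domain(relation, attribute):
    """Set of values that `attribute` actually assumes in the relation now."""
    return {row[attribute] for row in relation}


def projection_size(relation, attributes):
    """Exact cardinality of the duplicate-free projection onto `attributes`."""
    return len({tuple(row[a] for a in attributes) for row in relation})


def baseline_estimate(n_tuples, cardinalities):
    """Illustrative baseline only (NOT the paper's AD or A2D model).

    Expected number of distinct value combinations when n_tuples rows are
    drawn uniformly and independently, with replacement, from a space of
    d = product(cardinalities) combinations: d * (1 - (1 - 1/d) ** n_tuples).
    """
    d = prod(cardinalities)
    return d * (1.0 - (1.0 - 1.0 / d) ** n_tuples)


# Hypothetical toy relation; `id` keeps the rows distinct.
relation = [
    {"id": 1, "dept": "A", "city": "Rome"},
    {"id": 2, "dept": "A", "city": "Oslo"},
    {"id": 3, "dept": "B", "city": "Rome"},
    {"id": 4, "dept": "A", "city": "Rome"},
]

# Cardinalities of the active domains of the projected attributes.
cards = [len(active_domain(relation, a)) for a in ("dept", "city")]

exact = projection_size(relation, ("dept", "city"))
estimate = baseline_estimate(len(relation), cards)
print(exact, estimate)
```

On this toy input the exact projection has 3 tuples, while the baseline, fed only the row count and the two active domain cardinalities, gives roughly 2.73. An optimizer is in the second situation: it sees only the maintained cardinalities, not the data.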