Equi-depth multidimensional histograms
SIGMOD '88 Proceedings of the 1988 ACM SIGMOD international conference on Management of data
BMPD statistical software manual
BMPD statistical software manual
Statistical profile estimation in database systems
ACM Computing Surveys (CSUR)
A model of data distribution based on texture analysis
SIGMOD '85 Proceedings of the 1985 ACM SIGMOD international conference on Management of data
A detailed statistical model for relational query optimization
ACM '85 Proceedings of the 1985 ACM annual conference on The range of computing : mid-80's perspective: mid-80's perspective
Query Optimization in Database Systems
ACM Computing Surveys (CSUR)
Analysis and performance of inverted data base structures
Communications of the ACM
Access path selection in a relational database management system
SIGMOD '79 Proceedings of the 1979 ACM SIGMOD international conference on Management of data
Accurate estimation of the number of tuples satisfying a condition
SIGMOD '84 Proceedings of the 1984 ACM SIGMOD international conference on Management of data
The optimization of queries in relational databases
The optimization of queries in relational databases
Estimating selectivities in data bases
Estimating selectivities in data bases
A Hybrid Estimator for Selectivity Estimation
IEEE Transactions on Knowledge and Data Engineering
Hi-index | 0.00 |
An approach to estimating record selectivity rooted in the theory of fitting a hierarchy of models in discrete data analysis is presented. In contrast to parametric methods, this approach does not presuppose a distribution pattern to which the actual data conform; it searches for one that fits the actual data. This approach makes use of parsimonious models wherever appropriate in order to minimize the storage requirement without sacrificing accuracy. Two-dimensional cases are used as examples to illustrate the proposed method. It is demonstrated that the technique of identifying a good-fitting and parsimonious model can drastically reduce storage space and that the implementation of this technique requires little extra processing effort. The case of perfect or near-perfect association and the idea of keeping information about salient cells of a table are discussed. A strategy to reduce storage requirement in cases in which a good-fitting and parsimonious model is not available is proposed. Hierarchical models for three-dimensional cases are presented, along with a description of the W.E. Deming and F.F. Stephan (1940) iterative proportional fitting algorithm which fits hierarchical models of any dimensions.