A Contingency Approach to Estimating Record Selectivities

  • Authors: Pai-Cheng Chu
  • Affiliations: Ohio State Univ., Columbus
  • Venue: IEEE Transactions on Software Engineering
  • Year: 1991

Abstract

An approach to estimating record selectivity rooted in the theory of fitting a hierarchy of models in discrete data analysis is presented. In contrast to parametric methods, this approach does not presuppose a distribution pattern to which the actual data conform; it searches for one that fits the actual data. The approach uses parsimonious models wherever appropriate to minimize the storage requirement without sacrificing accuracy. Two-dimensional cases are used as examples to illustrate the proposed method. It is demonstrated that identifying a good-fitting and parsimonious model can drastically reduce storage space and that implementing this technique requires little extra processing effort. The case of perfect or near-perfect association and the idea of keeping information about salient cells of a table are discussed. A strategy is proposed to reduce the storage requirement in cases where a good-fitting and parsimonious model is not available. Hierarchical models for three-dimensional cases are presented, along with a description of the iterative proportional fitting algorithm of W.E. Deming and F.F. Stephan (1940), which fits hierarchical models of any dimension.
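
To make the idea of a parsimonious model concrete, the following is a minimal sketch of iterative proportional fitting for a two-dimensional table under the independence model. It is not the paper's implementation: the function names `ipf_2d` and `selectivity`, the iteration limit, the tolerance, and the sample counts are all assumptions chosen for illustration. The point it shows is that only the row and column totals (r + c numbers) need to be stored, rather than all r × c cell counts.

```python
def ipf_2d(row_totals, col_totals, max_iterations=50, tol=1e-9):
    """Fit a two-way table to the given margins by iterative proportional fitting.

    Starting from a table of ones, this yields the independence model.
    Hypothetical helper for illustration only.
    """
    n_rows, n_cols = len(row_totals), len(col_totals)
    fitted = [[1.0] * n_cols for _ in range(n_rows)]

    for _ in range(max_iterations):
        # Step 1: rescale each row so that it matches its row total.
        for i in range(n_rows):
            row_sum = sum(fitted[i])
            if row_sum > 0:
                factor = row_totals[i] / row_sum
                fitted[i] = [value * factor for value in fitted[i]]

        # Step 2: rescale each column so that it matches its column total.
        for j in range(n_cols):
            col_sum = sum(fitted[i][j] for i in range(n_rows))
            if col_sum > 0:
                factor = col_totals[j] / col_sum
                for i in range(n_rows):
                    fitted[i][j] *= factor

        # Stop once the row margins are still satisfied after the column step.
        deviation = max(
            abs(sum(fitted[i]) - row_totals[i]) for i in range(n_rows)
        )
        if deviation < tol:
            break

    return fitted


def selectivity(fitted, i, j):
    """Estimated fraction of records falling in cell (i, j)."""
    total = sum(sum(row) for row in fitted)
    return fitted[i][j] / total


# Made-up example: a 2 x 2 table of record counts for two attributes.
observed = [[30, 10],
            [30, 30]]
row_totals = [sum(row) for row in observed]        # [40, 60]
col_totals = [sum(col) for col in zip(*observed)]  # [60, 40]

fitted = ipf_2d(row_totals, col_totals)
estimate = selectivity(fitted, 0, 0)
```

For this made-up table the independence model estimates cell (0, 0) as 40 · 60 / 100 = 24 records, a selectivity of about 0.24, while the observed count is 30. A gap of this kind is what a goodness-of-fit check would weigh when deciding whether the parsimonious model is adequate or whether more information (for example, salient cells) should be kept. In two dimensions the independence fit also has a closed form; the iterative version is sketched here because the same row-then-column rescaling pattern extends to the margins of higher-dimensional hierarchical models.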