Statistical estimators for relational algebra expressions

  • Authors:
  • Wen-Chi Hou;Gultekin Ozsoyoglu;Baldeo K. Taneja

  • Affiliations:
  • Department of Computer Engineering and Science and Center for Automation and Intelligent Systems, Case Western Reserve University, Cleveland, Ohio;Department of Computer Engineering and Science and Center for Automation and Intelligent Systems, Case Western Reserve University, Cleveland, Ohio;Department of Computer Engineering and Science and Center for Automation and Intelligent Systems, Case Western Reserve University, Cleveland, Ohio and Department of Mathematics and Statistics, Cas ...

  • Venue:
  • Proceedings of the seventh ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
  • Year:
  • 1988

Abstract

Present database systems process all the data related to a query before returning a response. As a result, the amount of data to be processed becomes excessive for real-time or time-constrained environments. A new methodology is needed to systematically cut down the time spent processing the data involved in a query. To this end, we propose to use data samples and to construct an approximate synthetic response to a given query.

In this paper, we consider only COUNT(E) type queries, where E is an arbitrary relational algebra expression. We make no assumptions about the distribution of attribute values or the ordering of tuples in the input relations, and we propose consistent and unbiased estimators for arbitrary COUNT(E) type queries. We design a sampling plan based on the cluster sampling method to improve the utilization of sampled data and to reduce the cost of sampling. We also evaluate the performance of the proposed estimators.
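The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the estimators proposed in the paper. It assumes the simplest case, where E is a single selection over one relation, and treats each disk page as a cluster: a simple random sample of pages is drawn without replacement, the qualifying tuples in the sampled pages are counted, and the count is scaled by the inverse sampling fraction. The names `estimate_count`, `pages`, `predicate`, and the toy data are hypothetical and chosen for illustration.

```python
import random


def estimate_count(pages, predicate, sample_size, rng=None):
    """Estimate COUNT(select_predicate(R)) from a random sample of pages.

    pages       -- list of pages (clusters), each a list of tuples of R
    predicate   -- function returning True for tuples that satisfy E
    sample_size -- number of pages to sample without replacement
    """
    rng = rng or random.Random()
    total_pages = len(pages)
    if not 1 <= sample_size <= total_pages:
        raise ValueError("sample_size must be between 1 and len(pages)")

    # Simple random sample of whole pages (cluster sampling).
    sampled = rng.sample(pages, sample_size)

    # Count qualifying tuples in the sampled pages only.
    hits = sum(1 for page in sampled for row in page if predicate(row))

    # Scale up by the inverse sampling fraction M / m.
    return total_pages / sample_size * hits


# Toy relation (assumed data): 1000 tuples stored 50 per page.
rows = [{"id": i, "salary": 1000 * ((i * i) % 10)} for i in range(1000)]
pages = [rows[i:i + 50] for i in range(0, len(rows), 50)]


def high_paid(row):
    return row["salary"] >= 6000


exact = sum(1 for row in rows if high_paid(row))
estimate = estimate_count(pages, high_paid, sample_size=5, rng=random.Random(0))
full = estimate_count(pages, high_paid, sample_size=len(pages))

print("exact:", exact, "estimate from 5 of", len(pages), "pages:", estimate)
```

Because every page is equally likely to be sampled, the scaled count is unbiased for this selection-only case, and sampling all pages reproduces the exact count. Expressions involving joins, projections, or set operations need more care and are not covered by this sketch.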