Query Selectivity Estimation for Uncertain Data

  • Authors:
  • Sarvjeet Singh;Chris Mayfield;Rahul Shah;Sunil Prabhakar;Susanne Hambrusch

  • Affiliations:
  • Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA (all authors)

  • Venue:
  • SSDBM '08: Proceedings of the 20th International Conference on Scientific and Statistical Database Management
  • Year:
  • 2008

Abstract

Applications that must handle uncertain data have led to database management systems that extend relational databases with uncertain (probabilistic) data as a native data type. Such systems need automatic query optimization that can estimate the execution cost of a given query plan, as existing databases already do. For probabilistic data, this requires selectivity estimates that can handle multiple values for each attribute as well as new query types with threshold values. This paper presents novel selectivity estimation functions for uncertain data and shows how these functions can be integrated into PostgreSQL to optimize probabilistic queries over uncertain data. The proposed methods handle both attribute and tuple uncertainty. Experimental results show that the algorithms are efficient and give good selectivity estimates with low space and time overhead.
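
To make the quantity being estimated concrete, the following is a minimal sketch of the exact selectivity of a threshold range query over attribute-uncertain data. It is not the paper's estimation method: it assumes each uncertain attribute is a small discrete distribution, assumes the threshold is a minimum probability, and scans every row, which is precisely the cost an optimizer's estimator is meant to avoid. The names `UncertainValue`, `range_probability`, and `threshold_selectivity` are hypothetical and chosen for illustration only.

```python
from typing import Dict, List

# Assumed representation: an uncertain attribute is a discrete
# distribution mapping each possible value to its probability.
UncertainValue = Dict[float, float]


def range_probability(pdf: UncertainValue, low: float, high: float) -> float:
    """Probability mass of the uncertain value that falls in [low, high]."""
    return sum(p for value, p in pdf.items() if low <= value <= high)


def threshold_selectivity(
    rows: List[UncertainValue], low: float, high: float, tau: float
) -> float:
    """Fraction of rows whose probability of lying in [low, high] is at least tau."""
    if not rows:
        return 0.0
    qualifying = sum(
        1 for pdf in rows if range_probability(pdf, low, high) >= tau
    )
    return qualifying / len(rows)


# Illustrative data only; probabilities are chosen to be exact in binary.
rows = [
    {10.0: 0.5, 20.0: 0.5},
    {15.0: 1.0},
    {5.0: 0.75, 30.0: 0.25},
    {12.0: 0.25, 18.0: 0.25, 40.0: 0.5},
]

selectivity = threshold_selectivity(rows, low=10.0, high=20.0, tau=0.5)
```

A selectivity estimator would approximate this fraction from compact summary statistics rather than from the rows themselves, so that the query planner can compare plans without touching the data.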