Database Support for Probabilistic Attributes and Tuples

  • Authors:
  • Sarvjeet Singh;Chris Mayfield;Rahul Shah;Sunil Prabhakar;Susanne Hambrusch;Jennifer Neville;Reynold Cheng

  • Affiliations:
  • Department of Computer Science, Purdue University, West Lafayette, Indiana, USA. sarvjeet@cs.purdue.edu;Department of Computer Science, Purdue University, West Lafayette, Indiana, USA. cmayfiel@cs.purdue.edu;Department of Computer Science, Louisiana State University, Baton Rouge, Louisiana, USA. rahul@csc.lsu.edu;Department of Computer Science, Purdue University, West Lafayette, Indiana, USA. sunil@cs.purdue.edu;Department of Computer Science, Purdue University, West Lafayette, Indiana, USA. seh@cs.purdue.edu;Department of Computer Science, Purdue University, West Lafayette, Indiana, USA. neville@cs.purdue.edu;Department of Computing, Hong Kong Polytechnic University, Kowloon, Hong Kong, China. csckcheng@comp.polyu.edu.hk

  • Venue:
  • ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
  • Year:
  • 2008

Abstract

The inherent uncertainty of data present in numerous applications, such as sensor databases, text annotations, and information retrieval, motivates the need to handle imprecise data at the database level. Uncertainty can occur at the attribute or tuple level and is present in both continuous and discrete data domains. This paper presents a model for handling arbitrary probabilistic uncertain data (both discrete and continuous) natively at the database level. Our approach leads to a natural and efficient representation for probabilistic data. We develop a model that is consistent with possible worlds semantics and closed under basic relational operators. This is the first model that accurately and efficiently handles both continuous and discrete uncertainty. The model is implemented in a real database system (PostgreSQL), and the effectiveness and efficiency of our approach are validated experimentally.
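
As a rough illustration of what consistency with possible worlds semantics means for a discrete uncertain attribute, the following minimal Python sketch compares a selection evaluated by enumerating every possible world with the same selection evaluated directly on each tuple's distribution. It is not the paper's model or its PostgreSQL implementation: the toy relation `readings`, the function names, and the assumption that tuples are independent are all hypothetical and chosen only for this example.

```python
from itertools import product

# Hypothetical toy relation: each sensor has an uncertain discrete reading,
# stored as {value: probability}. Each distribution sums to 1.
readings = {
    "s1": {20: 0.5, 25: 0.5},
    "s2": {18: 0.25, 30: 0.75},
}


def possible_worlds(relation):
    """Yield (world, probability) pairs, assuming tuples are independent."""
    keys = list(relation)
    for combo in product(*(relation[k].items() for k in keys)):
        world = {k: value for k, (value, _) in zip(keys, combo)}
        prob = 1.0
        for _, p in combo:
            prob *= p
        yield world, prob


def select_via_worlds(relation, predicate):
    """Probability that each tuple satisfies the predicate, summed over worlds."""
    result = {k: 0.0 for k in relation}
    for world, prob in possible_worlds(relation):
        for k, value in world.items():
            if predicate(value):
                result[k] += prob
    return result


def select_direct(relation, predicate):
    """Same selection, computed on each tuple's distribution without enumeration."""
    return {
        k: sum(p for value, p in pdf.items() if predicate(value))
        for k, pdf in relation.items()
    }


def above_22(value):
    return value > 22


by_worlds = select_via_worlds(readings, above_22)
direct = select_direct(readings, above_22)
print(by_worlds, direct)
```

Under these assumptions the two evaluations agree, which is the property a representation needs in order to avoid materializing the exponentially many possible worlds.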