Querying and cleaning uncertain data

Authors:
Reynold Cheng
Affiliations:
Department of Computer Science, The University of Hong Kong, Hong Kong
Venue:
QuaCon'09 Proceedings of the 1st international conference on Quality of context
Year:
2009


Abstract

The management of uncertainty in large databases has recently attracted considerable research interest. Data uncertainty is inherent in many emerging and important applications, including location-based services, wireless sensor networks, biometric and biological databases, and data stream applications. In these systems, data uncertainty must be managed carefully in order to make correct decisions and provide high-quality services to users. Uncertain database systems have been proposed to enable the development of these applications. They treat data uncertainty as a "first-class citizen": they use generic data models to capture uncertainty and provide query operators that return answers with statistical confidences. We summarize our recent work on uncertain databases. We explain how data uncertainty can be modeled and present a classification of probabilistic queries (e.g., range queries and nearest-neighbor queries). We further study how probabilistic queries can be efficiently evaluated and indexed. We also highlight the problem of removing uncertainty under a stringent cleaning budget, with the aim of generating high-quality probabilistic answers.
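
As a rough illustration of what a query operator that returns answers with statistical confidences might look like, the sketch below evaluates a probabilistic range query over one-dimensional uncertain values. It is not taken from the paper. The uniform-interval uncertainty model, the names `UncertainValue` and `probabilistic_range_query`, the `threshold` parameter, and the sensor readings are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UncertainValue:
    """Hypothetical model: the true value lies in [lo, hi], uniformly distributed."""
    lo: float
    hi: float

    def prob_in_range(self, a: float, b: float) -> float:
        """Probability that the true value falls inside the query range [a, b]."""
        if self.hi == self.lo:
            # Degenerate interval: the value is precise.
            return 1.0 if a <= self.lo <= b else 0.0
        overlap = min(self.hi, b) - max(self.lo, a)
        return max(0.0, overlap) / (self.hi - self.lo)


def probabilistic_range_query(objects, a, b, threshold=0.0):
    """Return (object_id, probability) pairs whose probability exceeds threshold,
    ordered from most to least probable."""
    results = []
    for oid, value in objects.items():
        p = value.prob_in_range(a, b)
        if p > threshold:
            results.append((oid, p))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Made-up sensor readings with uncertainty intervals (illustrative data only).
readings = {
    "s1": UncertainValue(20.0, 30.0),
    "s2": UncertainValue(24.0, 28.0),
    "s3": UncertainValue(40.0, 50.0),
}
answers = probabilistic_range_query(readings, 25.0, 35.0, threshold=0.1)
print(answers)
```

Each answer pairs an object with the probability that it satisfies the query, so a caller can rank or filter results by confidence instead of receiving a plain yes/no set. A linear scan is used here only for brevity; it does not reflect the indexing techniques the abstract refers to.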