Data uncertainty arises in many situations. A common approach to processing queries over uncertain data is to sample many "possible worlds" from the data and run the queries against each sampled world. Sampling is not trivial, however, because a randomly sampled possible world may violate known constraints on the data. In this paper, we focus on an important category of constraints: aggregate constraints. An aggregate constraint is placed on a set of records rather than on a single record, and a real-life system usually has a large number of them. Finding qualified possible worlds in this setting is challenging, because tuple-by-tuple sampling rarely produces a qualified possible world and is therefore extremely inefficient. We introduce two approaches for querying uncertain data with aggregate constraints: constraint-aware sampling and MCMC sampling. Our experiments show that the new approaches yield high-quality query results at reasonable cost.
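To make the sampling problem concrete, the following is a minimal sketch and not the paper's algorithms. It assumes a toy uncertain table in which each record is a list of (value, probability) candidates, and a single hypothetical aggregate constraint, SUM over all records at most `max_total`. The function names (`rejection_sample`, `mcmc_sample`) and the data are illustrative assumptions. The first function draws every tuple independently and discards worlds that violate the constraint. The second is a generic single-record Metropolis-Hastings walk that proposes a new value for one record from that record's own distribution, so the acceptance ratio reduces to a feasibility check.

```python
import random


def sample_value(candidates, rng):
    """Draw one value from a list of (value, probability) pairs."""
    values = [v for v, _ in candidates]
    weights = [p for _, p in candidates]
    return rng.choices(values, weights=weights, k=1)[0]


def satisfies(world, max_total):
    """Assumed aggregate constraint: SUM over all records <= max_total."""
    return sum(world) <= max_total


def rejection_sample(records, max_total, n_tries, rng):
    """Tuple-by-tuple sampling: draw each record independently and
    keep only the worlds that satisfy the aggregate constraint."""
    accepted = []
    for _ in range(n_tries):
        world = [sample_value(c, rng) for c in records]
        if satisfies(world, max_total):
            accepted.append(world)
    return accepted


def mcmc_sample(records, max_total, start, n_steps, rng):
    """Single-record Metropolis-Hastings walk over qualified worlds.

    The proposal redraws one record from its own distribution, so the
    probability terms cancel and a move is accepted exactly when the
    resulting world still satisfies the constraint."""
    if not satisfies(start, max_total):
        raise ValueError("start must be a qualified world")
    world = list(start)
    total = sum(world)
    samples = []
    for _ in range(n_steps):
        i = rng.randrange(len(records))
        proposal = sample_value(records[i], rng)
        new_total = total - world[i] + proposal
        if new_total <= max_total:  # accept only feasible moves
            world[i] = proposal
            total = new_total
        samples.append(list(world))  # a rejected move repeats the old world
    return samples


# Hypothetical data: 20 records, each most likely 10 but occasionally 1.
rng = random.Random(0)
records = [[(1, 0.2), (10, 0.8)] for _ in range(20)]
max_total = 60  # feasible only if at most 4 of the 20 records take value 10

naive = rejection_sample(records, max_total, n_tries=2000, rng=rng)
chain = mcmc_sample(records, max_total, start=[1] * 20, n_steps=2000, rng=rng)
print(len(naive), "qualified worlds from 2000 independent draws")
print(len(chain), "qualified worlds from 2000 MCMC steps")
```

In this toy setting an independent draw is feasible only when at most 4 of the 20 records take their likely value, which is very rare, whereas every state of the chain is qualified by construction. Consecutive chain states are correlated, so a practical sampler would typically discard an initial burn-in and thin the chain before running queries on the samples.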