The merge/purge problem for large databases
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
ProbView: a flexible probabilistic database system
ACM Transactions on Database Systems (TODS)
The complexity of query reliability
PODS '98 Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Foundations of Databases: The Logical Level
Foundations of Databases: The Logical Level
The Management of Probabilistic Data
IEEE Transactions on Knowledge and Data Engineering
Scalar aggregation in inconsistent databases
Theoretical Computer Science - Database theory
Evaluating probabilistic queries over imprecise data
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Aggregate operators in probabilistic databases
Journal of the ACM (JACM)
Working Models for Uncertain Data
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Integrating Unstructured Data into Relational Databases
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Creating probabilistic databases from information extraction models
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Trio: a system for data, uncertainty, and lineage
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
OLAP over uncertain and imprecise data
The VLDB Journal — The International Journal on Very Large Data Bases
Management of probabilistic data: foundations and challenges
Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Efficient aggregation algorithms for probabilistic data
SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
Efficient query evaluation on probabilistic databases
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Materialized views in probabilistic databases: for information exchange and query optimization
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
First-order query rewriting for inconsistent databases
ICDT'05 Proceedings of the 10th international conference on Database Theory
Management of data with uncertainties
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Query efficiency in probabilistic XML models
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Incorporating constraints in probabilistic XML
Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Query evaluation with soft-key constraints
Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Managing Probabilistic Data with MystiQ: The Can-Do, the Could-Do, and the Can't-Do
SUM '08 Proceedings of the 2nd international conference on Scalable Uncertainty Management
Modeling and querying probabilistic XML data
ACM SIGMOD Record
On Query Algebras for Probabilistic Databases
ACM SIGMOD Record
Probabilistic databases: diamonds in the dirt
Communications of the ACM - Barbara Liskov: ACM's A.M. Turing Award Winner
Incorporating constraints in probabilistic XML
ACM Transactions on Database Systems (TODS)
Aggregate queries for discrete and continuous probabilistic XML
Proceedings of the 13th International Conference on Database Theory
Proceedings of the 13th International Conference on Database Theory
Journal of the ACM (JACM)
Capturing continuous data and answering aggregate queries in probabilistic XML
ACM Transactions on Database Systems (TODS)
PReach: Reachability in Probabilistic Signaling Networks
Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics
Hi-index | 0.00 |
We study the evaluation of positive conjunctive queries with Boolean aggregate tests (similar to HAVING queries in SQL) on probabilistic databases. Our motivation is to handle aggregate queries over imprecise data resulting from information integration or information extraction. More precisely, we study conjunctive queries with predicate aggregates using MIN, MAX, COUNT, SUM, AVG or COUNT(DISTINCT) on probabilistic databases. Computing the precise output probabilities for positive conjunctive queries (without HAVING) is #P-hard, but is in P for a restricted class of queries called safe queries. Further, for queries without self-joins either a query is safe or its data complexity is #P-Hard, which shows that safe queries exactly capture tractable queries without self-joins. In this paper, for each aggregate above, we find a class of queries that exactly capture efficient evaluation for HAVING queries without self-joins. Our algorithms use a novel technique to compute the marginal distributions of elements in a semiring, which may be of independent interest.