We study the evaluation of positive conjunctive queries with Boolean aggregate tests (similar to HAVING in SQL) on probabilistic databases. More precisely, we study conjunctive queries with predicate aggregates on probabilistic databases where the aggregation function is one of MIN, MAX, EXISTS, COUNT, SUM, AVG, or COUNT(DISTINCT) and the comparison function is one of =, ≠, ≥, >, ≤, or <. The complexity of evaluating a HAVING query depends on the aggregation function, α, and the comparison function, θ. In this paper, we establish a set of trichotomy results for conjunctive queries with HAVING predicates parametrized by (α, θ). For such queries (without self-joins), one of the following three statements is true: (1) The exact evaluation problem has $\mathcal{P}$-time data complexity. In this case, we call the query safe. (2) The exact evaluation problem is $\sharp\mathcal{P}$-hard, but the approximate evaluation problem has (randomized) $\mathcal{P}$-time data complexity. More precisely, there exists an FPTRAS for the query. In this case, we call the query apx-safe. (3) The exact evaluation problem is $\sharp\mathcal{P}$-hard, and the approximate evaluation problem is also hard. We call these queries hazardous. The precise definition of each class depends on the aggregate considered and the comparison function. Thus, we have queries that are (MAX, ≥)-safe, (COUNT, ≤)-apx-safe, (SUM, =)-hazardous, etc. Our trichotomy result is a significant extension of a previous dichotomy result that classifies Boolean conjunctive queries as safe or not safe. For each of the three classes we present novel techniques. For safe queries, we describe an evaluation algorithm that uses random variables over semirings. For apx-safe queries, we describe an FPTRAS that relies on a novel algorithm for generating a random possible world satisfying a given condition. Finally, for hazardous queries we give novel proofs of hardness of approximation. The results for safe queries were previously announced (in Ré, C., Suciu, D. Efficient evaluation of HAVING queries on a probabilistic database. In: DBPL, pp. 186–200, 2007), but all other results are new.
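
To make the notion of exact evaluation concrete, the following is a minimal sketch of the simplest possible case, not the algorithm of the paper: a single relation whose tuples are independent, each present with a given marginal probability, and a HAVING test of the form COUNT(*) ≥ k. The function names (`count_distribution`, `prob_count_at_least`, `brute_force`) and the single-relation, tuple-independent setting are assumptions made for illustration only. The sketch tracks the distribution of the count, truncated at k, one tuple at a time, which takes time polynomial in the number of tuples, and compares the result with a brute-force sum over all possible worlds.

```python
from itertools import product


def count_distribution(probs, cap):
    """Distribution of min(COUNT, cap) when tuple i is present
    independently with probability probs[i] (illustrative assumption)."""
    dist = [0.0] * (cap + 1)
    dist[0] = 1.0
    for p in probs:
        new = [0.0] * (cap + 1)
        for c, mass in enumerate(dist):
            new[c] += mass * (1.0 - p)        # tuple absent: count unchanged
            new[min(c + 1, cap)] += mass * p  # tuple present: count + 1, capped
        dist = new
    return dist


def prob_count_at_least(probs, k):
    """P[COUNT(*) >= k], computed in O(n * k) steps."""
    if k <= 0:
        return 1.0
    return count_distribution(probs, k)[k]


def brute_force(probs, k):
    """Same probability by enumerating all 2^n possible worlds."""
    total = 0.0
    for world in product([False, True], repeat=len(probs)):
        weight = 1.0
        for present, p in zip(world, probs):
            weight *= p if present else 1.0 - p
        if sum(world) >= k:
            total += weight
    return total


if __name__ == "__main__":
    marginals = [0.9, 0.5, 0.2]  # hypothetical tuple probabilities
    print(prob_count_at_least(marginals, 2))
    print(brute_force(marginals, 2))
```

Capping the count at k keeps the tracked state small because the test COUNT(*) ≥ k cannot distinguish counts above k. The brute-force version is exponential in the number of tuples and is included only as a correctness check on small inputs.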