Uncertain data is inherent in a few important applications. It is far from trivial to extend ranking queries (also known as top-k queries), a popular type of query on certain data, to uncertain data. In this paper, we cast ranking queries on uncertain data using three parameters: rank threshold k, probability threshold p, and answer set size threshold l. We systematically identify four types of ranking queries on uncertain data. First, a probability threshold top-k query returns the uncertain records whose probability of being in the top-k list is at least p. Second, a top-(k, l) query returns the l uncertain records with the largest probabilities of being ranked in the top-k. Third, the p-rank of an uncertain record is the smallest number k such that the record has a probability of at least p of being ranked in the top-k list; a rank threshold top-k query retrieves the records whose p-ranks are at most k. Last, a top-(p, l) query returns the l uncertain records with the smallest p-ranks. To answer such ranking queries, we present an efficient exact algorithm, a fast sampling algorithm, and a Poisson approximation-based algorithm. To answer top-(k, l) queries and top-(p, l) queries, we propose PRist+, a compact index, and develop an efficient index construction algorithm and effective query answering methods for it. An empirical study using real and synthetic data sets verifies the effectiveness of the probabilistic ranking queries and the efficiency of our methods.
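
To make the four query types concrete, here is a minimal Python sketch. It is not the paper's exact algorithm and does not implement PRist+. It assumes a simplified model that the abstract does not specify: each record is an `(id, score, membership probability)` triple, records appear independently of one another, scores are distinct, and a higher score ranks first. All function names (`topk_probabilities`, `pt_k`, `top_k_l`, `rt_k`, `top_p_l`, `p_rank`) are hypothetical and chosen only for illustration. Under these assumptions, the probability that a record is in the top-k equals its membership probability times the probability that fewer than k higher-scored records appear, which a small dynamic program can compute.

```python
from typing import Dict, List, Optional, Tuple

# Assumed record shape (not from the paper): (id, score, membership probability).
Record = Tuple[str, float, float]


def topk_probabilities(records: List[Record], max_k: int) -> Dict[str, List[float]]:
    """For each record, return [Pr(in top-1), ..., Pr(in top-max_k)].

    Assumes independent records and distinct scores (higher score ranks first).
    """
    ordered = sorted(records, key=lambda r: r[1], reverse=True)
    # dist[j] = Pr(exactly j of the higher-scored records appear), for j < max_k.
    dist = [1.0] + [0.0] * (max_k - 1)
    result: Dict[str, List[float]] = {}
    for rid, _score, prob in ordered:
        cumulative = 0.0
        row = []
        for j in range(max_k):
            cumulative += dist[j]
            # Record appears and at most j higher-scored records appear.
            row.append(prob * cumulative)
        result[rid] = row
        # Fold this record into the count distribution for lower-scored records.
        dist = [
            dist[j] * (1.0 - prob) + (dist[j - 1] * prob if j > 0 else 0.0)
            for j in range(max_k)
        ]
    return result


def p_rank(row: List[float], p: float) -> Optional[int]:
    """Smallest k with Pr(in top-k) >= p, or None if no such k exists."""
    for k, prob in enumerate(row, start=1):
        if prob >= p:
            return k
    return None


def pt_k(records: List[Record], k: int, p: float) -> List[str]:
    """Probability threshold top-k query: records with Pr(in top-k) >= p."""
    probs = topk_probabilities(records, k)
    return [rid for rid, row in probs.items() if row[k - 1] >= p]


def top_k_l(records: List[Record], k: int, l: int) -> List[str]:
    """Top-(k, l) query: the l records with the largest Pr(in top-k)."""
    probs = topk_probabilities(records, k)
    return sorted(probs, key=lambda rid: probs[rid][k - 1], reverse=True)[:l]


def rt_k(records: List[Record], k: int, p: float) -> List[str]:
    """Rank threshold top-k query: records whose p-rank is at most k."""
    probs = topk_probabilities(records, len(records))
    answer = []
    for rid, row in probs.items():
        rank = p_rank(row, p)
        if rank is not None and rank <= k:
            answer.append(rid)
    return answer


def top_p_l(records: List[Record], p: float, l: int) -> List[str]:
    """Top-(p, l) query: the l records with the smallest p-ranks."""
    probs = topk_probabilities(records, len(records))
    ranked = [(p_rank(row, p), rid) for rid, row in probs.items()]
    finite = [(rank, rid) for rank, rid in ranked if rank is not None]
    finite.sort(key=lambda pair: pair[0])
    return [rid for _rank, rid in finite[:l]]
```

For example, with the made-up records `[("a", 10.0, 0.5), ("b", 8.0, 0.9), ("c", 5.0, 0.6)]`, calling `pt_k(records, 2, 0.4)` keeps `a` and `b`, because `c` reaches the top-2 only when it appears and at most one of `a` and `b` does. The sketch recomputes all probabilities on every call; avoiding that kind of repeated work is the role the abstract assigns to an index such as PRist+.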