A probabilistic relational algebra for the integration of information retrieval and database systems
ACM Transactions on Information Systems (TOIS)
ProbView: a flexible probabilistic database system
ACM Transactions on Database Systems (TODS)
Rank aggregation methods for the Web
Proceedings of the 10th international conference on World Wide Web
Cumulated gain-based evaluation of IR techniques
ACM Transactions on Information Systems (TOIS)
SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
Optimizing search engines using clickthrough data
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Evaluating probabilistic queries over imprecise data
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Learning to rank using gradient descent
ICML '05 Proceedings of the 22nd international conference on Machine learning
Working Models for Uncertain Data
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Clean Answers over Dirty Databases: A Probabilistic Approach
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Creating probabilistic databases from information extraction models
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Management of probabilistic data: foundations and challenges
Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Efficient query evaluation on probabilistic databases
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Data integration with uncertainty
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Ranking queries on uncertain data: a probabilistic threshold approach
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Event queries on correlated probabilistic streams
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
A survey of top-k query processing techniques in relational database systems
ACM Computing Surveys (CSUR)
Sliding-window top-k queries on uncertain streams
Proceedings of the VLDB Endowment
Conditioning probabilistic databases
Proceedings of the VLDB Endowment
Efficient search for the top-k probable nearest neighbors in uncertain databases
Proceedings of the VLDB Endowment
Learning to create data-integrating queries
Proceedings of the VLDB Endowment
Evaluating probability threshold k-nearest-neighbor queries over uncertain data
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Proceedings of the forty-first annual ACM symposium on Theory of computing
Efficient Processing of Top-k Queries in Uncertain Databases
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Probabilistic Verifiers: Evaluating Constrained Nearest-Neighbor Queries over Uncertain Data
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Semantics of Ranking Queries for Probabilistic Data and Expected Ranks
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Ef?cient Query Evaluation over Temporally Correlated Probabilistic Streams
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Consensus answers for queries over probabilistic databases
Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Top-k queries on uncertain data: on score distribution and typical answers
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Indexing correlated probabilistic databases
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Learning to Rank for Information Retrieval
Foundations and Trends in Information Retrieval
PrDB: managing and exploiting rich correlations in probabilistic databases
The VLDB Journal — The International Journal on Very Large Data Bases
Probabilistic nearest-neighbor query on uncertain objects
DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications
Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Approximation algorithms for diversified search ranking
ICALP'10 Proceedings of the 37th international colloquium conference on Automata, languages and programming: Part II
Ranking continuous probabilistic datasets
Proceedings of the VLDB Endowment
k-selection query over uncertain data
DASFAA'10 Proceedings of the 15th international conference on Database Systems for Advanced Applications - Volume Part I
Models for incomplete and probabilistic information
EDBT'06 Proceedings of the 2006 international conference on Current Trends in Database Technology
On pruning for top-k ranking in uncertain databases
Proceedings of the VLDB Endowment
Efficient fuzzy ranking queries in uncertain databases
Applied Intelligence
Finding top k most influential spatial facilities over uncertain objects
Proceedings of the 21st ACM international conference on Information and knowledge management
Applying weighted queries on probabilistic databases
Proceedings of the 21st ACM international conference on Information and knowledge management
A top-k filter for logic-based similarity conditions on probabilistic databases
ADBIS'12 Proceedings of the 16th East European conference on Advances in Databases and Information Systems
Top-k best probability queries and semantics ranking properties on probabilistic databases
Data & Knowledge Engineering
Hi-index | 0.01 |
Ranking is a fundamental operation in data analysis and decision support and plays an even more crucial role if the dataset being explored exhibits uncertainty. This has led to much work in understanding how to rank the tuples in a probabilistic dataset in recent years. In this article, we present a unified approach to ranking and top-k query processing in probabilistic databases by viewing it as a multi-criterion optimization problem and by deriving a set of features that capture the key properties of a probabilistic dataset that dictate the ranked result. We contend that a single, specific ranking function may not suffice for probabilistic databases, and we instead propose two parameterized ranking functions, called PRF 驴 and PRF e, that generalize or can approximate many of the previously proposed ranking functions. We present novel generating functions-based algorithms for efficiently ranking large datasets according to these ranking functions, even if the datasets exhibit complex correlations modeled using probabilistic and/xor trees or Markov networks. We further propose that the parameters of the ranking function be learned from user preferences, and we develop an approach to learn those parameters. Finally, we present a comprehensive experimental study that illustrates the effectiveness of our parameterized ranking functions, especially PRF e, at approximating other ranking functions and the scalability of our proposed algorithms for exact or approximate ranking.