Top-k best probability queries and semantics ranking properties on probabilistic databases

Authors:
Trieu Minh Nhut Le;Jinli Cao;Zhen He
Venue:
Data & Knowledge Engineering
Year:
2013

Citing 29
Cited 0

On the representation and querying of sets of possible worlds

SIGMOD '87 Proceedings of the 1987 ACM SIGMOD international conference on Management of data
The Skyline Operator

Proceedings of the 17th International Conference on Data Engineering
Learning Probabilistic Relational Models

SARA '02 Proceedings of the 4th International Symposium on Abstraction, Reformulation, and Approximation
An optimal and progressive algorithm for skyline queries

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Evaluating probabilistic queries over imprecise data

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Aggregate operators in probabilistic databases

Journal of the ACM (JACM)
Knowledge discovery by probabilistic clustering of distributed databases

Data & Knowledge Engineering
Working Models for Uncertain Data

ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Probabilistic skylines on uncertain data

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Probabilistic ranked queries in uncertain databases

EDBT '08 Proceedings of the 11th international conference on Extending database technology: Advances in database technology
Ranking queries on uncertain data: a probabilistic threshold approach

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Query answering techniques on uncertain and probabilistic data: tutorial summary

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Probabilistic top-k and ranking-aggregate queries

ACM Transactions on Database Systems (TODS)
A survey of top-k query processing techniques in relational database systems

ACM Computing Surveys (CSUR)
Managing Uncertain Data: Probabilistic Approaches

WAIM '08 Proceedings of the 2008 The Ninth International Conference on Web-Age Information Management
Sliding-window top-k queries on uncertain streams

Proceedings of the VLDB Endowment
Generating efficient safe query plans for probabilistic databases

Data & Knowledge Engineering
Efficient Processing of Top-k Queries in Uncertain Databases with x-Relations

IEEE Transactions on Knowledge and Data Engineering
A Survey of Uncertain Data Algorithms and Applications

IEEE Transactions on Knowledge and Data Engineering
Semantics of Ranking Queries for Probabilistic Data and Expected Ranks

ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Computing all skyline probabilities for uncertain data

Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Top-k queries on uncertain data: on score distribution and typical answers

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Semantics and evaluation of top-k queries in probabilistic databases

Distributed and Parallel Databases
Ranking the sky: Discovering the importance of skyline points through subspace dominance relationships

Data & Knowledge Engineering
Ranking queries on uncertain data

The VLDB Journal — The International Journal on Very Large Data Bases
A unified approach to ranking in probabilistic databases

The VLDB Journal — The International Journal on Very Large Data Bases
Robust ranking of uncertain data

DASFAA'11 Proceedings of the 16th international conference on Database systems for advanced applications - Volume Part I
Shooting top-k stars in uncertain databases

The VLDB Journal — The International Journal on Very Large Data Bases
Top-k best probability queries on probabilistic data

DASFAA'12 Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part II

Abstract

Top-k queries on probabilistic data have attracted much interest in applications such as market analysis, personalized services, and decision making. In probabilistic relational databases, the most common problem in answering top-k queries (ranking queries) is selecting the top-k result based on both scores and top-k probabilities. In this paper, we first propose novel answers to top-k best probability queries, which select the probabilistic tuples that have not only the best top-k scores but also the best top-k probabilities. We introduce an efficient algorithm for top-k best probability queries that does not require users to define a threshold. The top-k best probability approach is more efficient and effective than the probability threshold approach (PT-k) [1,2]. Second, we add the "k-best ranking score" to the set of semantic properties for ranking queries on uncertain data proposed in [3,4]. We then analyze our proposed method and show that it meets these semantic ranking properties. In addition, we show that the answers to top-k best probability queries overcome the drawbacks of previous definitions of top-k queries on probabilistic data in terms of the semantic ranking properties. Lastly, we conduct an extensive experimental study on both real and synthetic data sets, verifying the effectiveness of the answers to top-k best probability queries compared with PT-k queries on uncertain data, and the efficiency of our algorithm against the state-of-the-art implementation of the PT-k algorithm.
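To make the notion of a top-k probability concrete, the following is a minimal sketch of how it can be computed, together with the threshold-based PT-k selection that the abstract uses as its baseline. It is not the algorithm proposed in the paper. It assumes a simple tuple-independent model (each tuple appears independently with its membership probability) and distinct scores; the function names `topk_probabilities` and `pt_k` and the sample data are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def topk_probabilities(
    tuples: List[Tuple[float, float]], k: int
) -> List[Tuple[float, float, float]]:
    """Return (score, probability, top-k probability) in descending score order.

    Assumption: tuples are independent and scores are distinct.
    The top-k probability of a tuple is the probability that it appears
    and that at most k-1 higher-scored tuples appear with it.
    """
    ranked = sorted(tuples, key=lambda t: t[0], reverse=True)
    # exact[j] = Pr(exactly j of the higher-scored tuples seen so far appear),
    # tracked only for j = 0 .. k-1.
    exact = [1.0] + [0.0] * (k - 1)
    result = []
    for score, p in ranked:
        result.append((score, p, p * sum(exact)))
        # Fold the current tuple into the distribution (in place, high j first).
        for j in range(k - 1, 0, -1):
            exact[j] = exact[j] * (1.0 - p) + exact[j - 1] * p
        exact[0] *= 1.0 - p
    return result


def pt_k(
    tuples: List[Tuple[float, float]], k: int, threshold: float
) -> List[Tuple[float, float, float]]:
    """Threshold-style selection: keep tuples whose top-k probability
    is at least the user-supplied threshold."""
    return [t for t in topk_probabilities(tuples, k) if t[2] >= threshold]


if __name__ == "__main__":
    # Hypothetical data: (score, membership probability).
    data = [(10.0, 0.5), (8.0, 0.8), (6.0, 1.0)]
    for row in topk_probabilities(data, k=2):
        print(row)
    print(pt_k(data, k=2, threshold=0.7))
```

The sketch shows why a threshold-based answer depends on a user-chosen parameter: changing `threshold` changes which tuples are returned, which is the requirement the abstract says the top-k best probability approach avoids.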