Processing top-k join queries

Authors:
Minji Wu;Laure Berti-Équille;Amélie Marian;Cecilia M. Procopiuc;Divesh Srivastava
Affiliations:
Rutgers University;University of Rennes;Rutgers University;AT&T Labs-Research;AT&T Labs-Research
Venue:
Proceedings of the VLDB Endowment
Year:
2010

Abstract

We consider the problem of efficiently finding the top-k answers for join queries over web-accessible databases. Classical top-k algorithms use branch-and-bound techniques to avoid computing the scores of all candidates when identifying the top-k answers. Applying such techniques requires efficiently computing lower and upper bounds, as well as expected scores, of candidate answers incrementally during evaluation. In this paper, we describe novel techniques for these problems. Our first contribution is a method to efficiently compute bounds for the score of a query result when tuples in the tables of the "FROM" clause are discovered incrementally, through either sorted or random access. Our second contribution is an algorithm that, given a set of partially evaluated candidate answers, determines a good order in which to access the tables so as to minimize wasted effort in computing the top-k answers. We evaluate our algorithms on a variety of queries and data sets and demonstrate the significant benefits they provide.
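To make the branch-and-bound idea concrete, the following is a minimal Python sketch of bound-based pruning over partially evaluated candidates. It is not the paper's algorithm. It assumes, purely for illustration, that a candidate's score is the sum of per-table scores, that an unseen score is at least 0, and that each table has a known ceiling on unseen scores (for example, the last score returned by sorted access). The names `Candidate`, `bounds`, `prune`, and `ceiling` are hypothetical.

```python
from dataclasses import dataclass, field
import heapq


@dataclass
class Candidate:
    name: str
    # Scores discovered so far, keyed by table name (assumed additive).
    known: dict = field(default_factory=dict)


def bounds(cand, tables, ceiling):
    """Return (lower, upper) score bounds for a partially evaluated candidate.

    Assumption: unseen scores lie between 0 and the per-table ceiling.
    """
    lower = sum(cand.known.get(t, 0.0) for t in tables)
    upper = sum(cand.known.get(t, ceiling[t]) for t in tables)
    return lower, upper


def prune(cands, tables, ceiling, k):
    """Keep only candidates that could still reach the top-k."""
    scored = [(c, *bounds(c, tables, ceiling)) for c in cands]
    if len(scored) <= k:
        return [c.name for c in cands]
    # The k-th largest lower bound is a score that k candidates are
    # guaranteed to reach, so anything whose upper bound is below it is out.
    threshold = heapq.nlargest(k, (lo for _, lo, _ in scored))[-1]
    return [c.name for c, _, hi in scored if hi >= threshold]


tables = ["R", "S"]
ceiling = {"R": 0.25, "S": 0.25}  # hypothetical ceilings on unseen scores
cands = [
    Candidate("a", {"R": 0.75, "S": 0.75}),  # fully evaluated
    Candidate("b", {"R": 0.75}),             # S not yet accessed
    Candidate("c", {"S": 0.625}),            # R not yet accessed
    Candidate("d"),                          # nothing accessed yet
]
survivors = prune(cands, tables, ceiling, k=2)
print(survivors)
```

In this sketch, candidate `d` is discarded without any further table access because its upper bound falls below the second-largest lower bound. Which table to access next for the surviving candidates is the access-ordering question that the abstract's second contribution addresses, and it is not modeled here.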