Top-k join queries: overcoming the curse of anti-correlation

Authors:
Manish Patil;Rahul Shah;Sharma V. Thankachan
Affiliations:
Louisiana State University, Baton Rouge, LA;Louisiana State University, Baton Rouge, LA;Louisiana State University, Baton Rouge, LA
Venue:
Proceedings of the 17th International Database Engineering & Applications Symposium
Year:
2013

Abstract

Existing heuristics for top-k join queries aim to minimize the scan depth and rely heavily on the scores and on the correlation between scores. It is known that for two relations of length n with uniformly random scores, a scan depth of √(kn) is required. Moreover, when the selection criteria being optimized are anti-correlated, the required scan depth can reach (n + k)/2. We build a linear-space index that maintains a subset of answers in anticipation of worst-case queries. Based on this index, we achieve Õ(√(kn)) join trials, i.e., average-case performance even for worst-case queries. The experimental evaluation shows superior performance against the well-known Rank-Join algorithm.
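
To make the notion of scan depth concrete, the following is a minimal sketch of a threshold-based rank join of the general kind the abstract names as its baseline. It is not the authors' index and not a faithful reproduction of any published implementation: the function name rank_join_top_k, the sum aggregate, the strictly alternating read order, and the toy inputs are all assumptions made for illustration.

```python
import heapq
from collections import defaultdict


def rank_join_top_k(left, right, k):
    """Top-k equi-join of two inputs sorted by descending score.

    left, right: lists of (join_key, score) tuples; join keys are assumed
    orderable so that score ties can be broken inside the heap.
    Returns (results, depth): results is a list of (combined_score, join_key)
    in descending order; depth is the number of tuples read from each input.
    """
    inputs = (left, right)
    if k <= 0 or not left or not right:
        return [], (0, 0)
    top = (left[0][1], right[0][1])    # best score of each input
    last = list(top)                   # score of the last tuple read per input
    seen = (defaultdict(list), defaultdict(list))  # join_key -> scores read
    depth = [0, 0]
    best = []                          # min-heap holding the best k results
    side = 0
    while depth[0] < len(left) or depth[1] < len(right):
        if depth[side] >= len(inputs[side]):   # this input is exhausted
            side = 1 - side
        key, score = inputs[side][depth[side]]
        depth[side] += 1
        last[side] = score
        seen[side][key].append(score)
        # Join the new tuple with matching tuples already read from the other input.
        for other in seen[1 - side].get(key, ()):
            item = (score + other, key)
            if len(best) < k:
                heapq.heappush(best, item)
            elif item[0] > best[0][0]:
                heapq.heapreplace(best, item)
        # Upper bound on any result that uses a tuple not yet read:
        # an unread tuple of input s scores at most last[s], and its partner
        # scores at most top[1 - s]. Exhausted inputs contribute no bound.
        bounds = [last[s] + top[1 - s]
                  for s in (0, 1) if depth[s] < len(inputs[s])]
        threshold = max(bounds, default=float("-inf"))
        if len(best) == k and best[0][0] >= threshold:
            break
        side = 1 - side
    return sorted(best, reverse=True), tuple(depth)


# Correlated scores: the best join result is found at depth 1 on both inputs.
correlated = rank_join_top_k(
    [("a", 9), ("b", 8), ("c", 3)], [("a", 10), ("b", 7), ("c", 1)], 1)

# Anti-correlated scores: both inputs are read in full before the scan can stop.
anti_correlated = rank_join_top_k(
    [("a", 9), ("b", 8), ("c", 3)], [("c", 10), ("b", 7), ("a", 1)], 1)
```

In this toy run the correlated inputs stop after one tuple per side, while the anti-correlated inputs force a full scan of both lists for the same k, which is the behavior the abstract describes as the motivation for precomputing a subset of answers.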