Progressive and selective merge: computing top-k with ad-hoc ranking functions

Authors:
Dong Xin; Jiawei Han; Kevin C. Chang
Affiliations:
University of Illinois at Urbana-Champaign, Urbana, IL; University of Illinois at Urbana-Champaign, Urbana, IL; University of Illinois at Urbana-Champaign, Urbana, IL
Venue:
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Year:
2007

Abstract

The family of threshold algorithms (i.e., TA) has been widely studied for efficiently computing top-k queries. TA uses a sort-merge framework that assumes the data lists are pre-sorted and the ranking functions are monotone. However, in many database applications, attribute values are indexed by tree-structured indices (e.g., B-tree, R-tree), and the ranking functions are not necessarily monotone. To answer top-k queries with ad-hoc ranking functions, this paper studies an index-merge paradigm that performs progressive search over the space of joint states composed of multiple index nodes. We address two challenges for efficient query processing. First, to minimize the search complexity, we present a double-heap algorithm that supports not only progressive state search but also progressive state generation. Second, to avoid unnecessary disk access, we characterize a type of "empty-state" that does not contribute to the final results, and propose a new materialization model, join-signature, to prune empty-states. Our performance study shows that the proposed method achieves a speed-up of one order of magnitude over baseline solutions.
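To make the idea of searching over joint states more concrete, the following is a minimal Python sketch, not the paper's algorithm. It uses a single heap for a plain best-first search and does not implement the double-heap method or join-signatures. The names `Node`, `build_index`, `score`, `lower_bound`, and `top_k` are hypothetical, as are the toy binary interval index and the example ranking function f(a, b) = (a - b)^2, which is not monotone and is minimized here. The sketch assumes a lower bound of f over a pair of index nodes can be computed, and it counts leaf-level joint states that contain no tuple, which is the situation the abstract calls an "empty-state".

```python
import heapq
from itertools import count

class Node:
    """Toy binary index node over one attribute (hypothetical stand-in for a B-tree node)."""
    def __init__(self, items, leaf_size):
        # items: list of (value, tuple_id), sorted by value
        self.lo, self.hi = items[0][0], items[-1][0]
        self.children, self.ids = [], set()
        if len(items) > leaf_size:
            mid = len(items) // 2
            self.children = [Node(items[:mid], leaf_size), Node(items[mid:], leaf_size)]
        else:
            self.ids = {tid for _, tid in items}  # only leaves store tuple ids

def build_index(rows, attr, leaf_size):
    return Node(sorted((row[attr], tid) for tid, row in enumerate(rows)), leaf_size)

def score(a, b):
    # Example ad-hoc, non-monotone ranking function (smaller is better).
    return (a - b) ** 2

def lower_bound(na, nb):
    # Smallest possible (a - b)^2 for a in [na.lo, na.hi] and b in [nb.lo, nb.hi].
    gap = max(na.lo - nb.hi, nb.lo - na.hi, 0)
    return gap ** 2

def top_k(rows, k, leaf_size=2):
    ia = build_index(rows, 0, leaf_size)
    ib = build_index(rows, 1, leaf_size)
    tie = count()  # unique tie-breaker so heap entries never compare payloads
    heap = [(lower_bound(ia, ib), next(tie), "state", (ia, ib))]
    results, empty_states = [], 0
    while heap and len(results) < k:
        bound, _, kind, payload = heapq.heappop(heap)
        if kind == "tuple":
            # No remaining state or tuple can score below this one.
            results.append((payload, bound))
            continue
        na, nb = payload
        if not na.children and not nb.children:
            shared = na.ids & nb.ids  # tuples present in both leaf nodes
            if not shared:
                empty_states += 1  # joint state that contributes no tuple
            for tid in shared:
                a, b = rows[tid]
                heapq.heappush(heap, (score(a, b), next(tie), "tuple", tid))
            continue
        # Expand every non-leaf side into its children to form child joint states.
        for ca in na.children or [na]:
            for cb in nb.children or [nb]:
                heapq.heappush(heap, (lower_bound(ca, cb), next(tie), "state", (ca, cb)))
    return results, empty_states

rows = [(1, 9), (4, 5), (7, 7), (3, 8), (10, 2), (6, 1)]
best, empty = top_k(rows, 2)
print(best, "empty leaf states visited:", empty)
```

In this sketch every empty leaf-level state is still popped and checked, which is wasted work; pruning such states before they are visited is the role the abstract assigns to the join-signature.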