Towards robust indexing for ranked queries

Authors:
Dong Xin;Chen Chen;Jiawei Han
Affiliations:
Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL;Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL;Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL
Venue:
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Year:
2006

Citing 12

Combining fuzzy information from multiple systems (extended abstract). PODS '96 Proceedings of the fifteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Processing queries by linear constraints. PODS '97 Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Fuzzy queries in multimedia database systems. PODS '98 Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Efficient searching with linear constraints. PODS '98 Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
On Finding the Maxima of a Set of Vectors. Journal of the ACM (JACM)
The onion technique: indexing for linear optimization queries. SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Optimal aggregation algorithms for middleware. PODS '01 Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
PREFER: a system for the efficient execution of multi-parametric ranked queries. SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Introduction to algorithms
Data Structures with C++
The Skyline Operator. Proceedings of the 17th International Conference on Data Engineering
Evaluating Top-k Selection Queries. VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases

Cited 13

Spark: top-k keyword query in relational databases. Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Probabilistic ranked queries in uncertain databases. EDBT '08 Proceedings of the 11th international conference on Extending database technology: Advances in database technology
A survey of top-k query processing techniques in relational database systems. ACM Computing Surveys (CSUR)
The partitioned-layer index: Answering monotone top-k queries using the convex skyline and partitioning-merging technique. Information Sciences: an International Journal
Aggregate computation over data streams. APWeb'08 Proceedings of the 10th Asia-Pacific web conference on Progress in WWW research and development
Probabilistic inverse ranking queries in uncertain databases. The VLDB Journal — The International Journal on Very Large Data Bases
Efficient and generic evaluation of ranked queries. Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Answering top-k queries over a mixture of attractive and repulsive dimensions. Proceedings of the VLDB Endowment
Efficient approximation of the maximal preference scores by lightweight cubic views. Proceedings of the 15th International Conference on Extending Database Technology
Personalized query evaluation in ring-based P2P networks. Information Sciences: an International Journal
Subspace top-k query processing using the hybrid-layer index with a tight bound. Data & Knowledge Engineering
Efficient top-k query answering using cached views. Proceedings of the 16th International Conference on Extending Database Technology
Branch-and-bound algorithm for reverse top-k queries. Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data

Abstract

A top-k query asks for k tuples ordered according to a specific ranking function that combines the values of multiple participating attributes. The combined score function is usually linear. To answer top-k queries efficiently, the data are often preprocessed and indexed to speed up run-time performance. Many indexing methods allow the online query algorithm to retrieve the data progressively and stop at a certain point. However, in many cases the number of data accesses is sensitive to the query parameters (i.e., the linear weights in the score function).

In this paper, we study the sequentially layered indexing problem, where tuples are put into multiple consecutive layers and any top-k query can be answered using at most k layers of tuples. We propose a new criterion for building the layered index. A layered index is robust if, for any k, the number of tuples in the top k layers is minimal in comparison with all other alternatives. The robust index guarantees the worst-case performance for arbitrary query parameters. We derive a necessary and sufficient condition for a robust index. The problem is shown to be solvable within $O(n^d \log n)$ time (where d is the number of dimensions and n is the number of tuples). To reduce the high complexity of the exact solution, we develop an approximate approach, which has time complexity $O(2^d n (\log n)^{r(d)-1})$, where $r(d) = \lceil d/2 \rceil + \lfloor d/2 \rfloor \lceil d/2 \rceil$. Our experimental results show that our proposed method outperforms the best previously known methods.
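To make the idea of a sequentially layered index concrete, the following is a minimal sketch in Python. It is not the robust index proposed in the paper: it uses a simple dominance-based (skyline) layering as a stand-in, which is one well-known way to obtain layers such that a top-k query only needs the first k layers. The assumptions here are that smaller scores rank higher and that all weights are non-negative; the names `dominates`, `build_layers`, `score`, and `top_k` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a is no worse than b in every dimension and strictly better in one.

    Assumption: smaller attribute values are better.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def build_layers(points: Sequence[Point]) -> List[List[Point]]:
    """Peel off successive skylines: layer 1 is the set of non-dominated points,
    layer 2 is the set of non-dominated points among the rest, and so on.

    This is a quadratic-time illustration, not the paper's robust layering.
    """
    remaining = list(points)
    layers: List[List[Point]] = []
    while remaining:
        layer = [p for p in remaining
                 if not any(dominates(q, p) for q in remaining)]
        layers.append(layer)
        remaining = [p for p in remaining if p not in layer]
    return layers


def score(weights: Sequence[float], p: Point) -> float:
    """Linear score function: weighted sum of the attribute values."""
    return sum(w * x for w, x in zip(weights, p))


def top_k(layers: List[List[Point]], weights: Sequence[float], k: int) -> List[Point]:
    """Answer a top-k query (smallest scores first) by scanning only the first k layers.

    Assumption: all weights are non-negative, so a dominated point never
    scores strictly better than a point that dominates it.
    """
    candidates = [p for layer in layers[:k] for p in layer]
    return sorted(candidates, key=lambda p: score(weights, p))[:k]


# Hypothetical example data: six two-dimensional tuples.
points = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4), (6, 6)]
layers = build_layers(points)
result = top_k(layers, weights=(0.5, 0.5), k=2)
```

The sketch shows why the layer sizes matter: the cost of a query is the number of tuples in the first k layers, regardless of the weights. A robust index in the sense of the abstract would choose the layers so that this count is as small as possible for every k, which the simple skyline peeling above does not attempt to do.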