Optimal top-k generation of attribute combinations based on ranked lists

Authors:
Jiaheng Lu;Pierre Senellart;Chunbin Lin;Xiaoyong Du;Shan Wang;Xinxing Chen
Affiliations:
Renmin University of China, Beijing, China;Institut Telecom/ Telecom ParisTech, Paris, France;Renmin University of China, Beijing, China;Renmin University of China, Beijing, China;Renmin University of China, Beijing, China;Renmin University of China, Beijing, China
Venue:
SIGMOD '12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
Year:
2012

Citing 21
Cited 2

Optimal aggregation algorithms for middleware

PODS '01 Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Minimal probing: supporting expensive predicates for top-k queries

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Optimizing Multi-Feature Queries for Image Databases

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Query Processing Issues in Image (Multimedia) Databases

ICDE '99 Proceedings of the 15th International Conference on Data Engineering
Query word deletion prediction

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Optimal aggregation algorithms for middleware

Journal of Computer and System Sciences - Special issue on PODS 2001
Evidence Combination in Medical Data Mining

ITCC '04 Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC'04) Volume 2
Adaptive Processing of Top-k Queries in XML

ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Efficient and self-tuning incremental query expansion for top-k query processing

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Structure and content scoring for XML

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Supporting ad-hoc ranking aggregates

Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Progressive and selective merge: computing top-k with ad-hoc ranking functions

Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Efficient top-k aggregation of ranked inputs

ACM Transactions on Database Systems (TODS)
Joining ranked inputs in practice

VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Extracting k most important groups from data efficiently

Data & Knowledge Engineering
A survey of top-k query processing techniques in relational database systems

ACM Computing Surveys (CSUR)
Effective XML Keyword Search with Relevance Oriented Ranking

ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Towards an Effective XML Keyword Search

IEEE Transactions on Knowledge and Data Engineering
Efficient and generic evaluation of ranked queries

Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
XClean: Providing valid spelling suggestions for XML keyword queries

ICDE '11 Proceedings of the 2011 IEEE 27th International Conference on Data Engineering
Top-k keyword search over probabilistic XML data

ICDE '11 Proceedings of the 2011 IEEE 27th International Conference on Data Engineering

Big data challenge: a data management perspective

Frontiers of Computer Science: Selected Publications from Chinese Universities
GeoRank: an efficient location-aware news feed ranking system

Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems

Quantified Score

Hi-index	0.00

Abstract

In this work, we study a novel query type, called top-k,m queries. Suppose we are given a set of groups, where each group contains a set of attributes and each attribute is associated with a ranked list of tuples, each with an ID and a score. All lists are ranked in decreasing order of tuple score. We are interested in finding the best combinations of attributes, where each combination takes one attribute from each group. More specifically, we want the top-k combinations of attributes, ranked according to their top-m tuples with matching IDs. This problem has a wide range of applications, from databases to search engines, on traditional and non-traditional types of data (relational data, XML, text, etc.). We show that a straightforward extension of an optimal top-k algorithm, the Threshold Algorithm (TA), has shortcomings in solving the top-k,m problem: it needs to compute a large number of intermediate results for each combination and reads more inputs than needed. To overcome this weakness, we provide, for the first time, a provably instance-optimal algorithm, and we further develop optimizations for efficient query evaluation that reduce computational and memory costs and the number of accesses. We demonstrate experimentally the scalability and efficiency of our algorithms over three real applications.
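
To make the query definition in the abstract concrete, the following is a minimal brute-force sketch of a top-k,m query. It is not the instance-optimal algorithm from the paper, nor the TA extension it discusses; it simply enumerates every combination and reads every list in full. The function name `top_k_m`, the input layout (one dictionary per group, mapping an attribute name to its ranked list), and the use of summation both to score a matched tuple and to score a combination are assumptions made for illustration only.

```python
from itertools import product
import heapq


def top_k_m(groups, k, m):
    """Naive top-k,m evaluation (illustrative baseline, hypothetical API).

    groups: list of dicts, one per group, mapping an attribute name to a
            list of (tuple_id, score) pairs sorted by decreasing score.
    Returns the k best (combination_score, attribute_names) pairs.
    """
    scored = []
    # A combination picks exactly one attribute from each group.
    for combo in product(*(g.items() for g in groups)):
        names = tuple(name for name, _ in combo)
        lists = [dict(ranked) for _, ranked in combo]
        # Only tuples whose ID appears in every chosen list are matches.
        shared_ids = set(lists[0]).intersection(*lists[1:])
        # Assumption: a matched tuple's score is the sum over the lists.
        tuple_scores = [sum(lst[i] for lst in lists) for i in shared_ids]
        # Assumption: a combination's score is the sum of its top-m tuples.
        combo_score = sum(heapq.nlargest(m, tuple_scores))
        scored.append((combo_score, names))
    return heapq.nlargest(k, scored)


# Two groups with two attributes each; integer scores keep the sums exact.
groups = [
    {"A1": [(1, 9), (2, 8), (3, 1)], "A2": [(1, 5), (4, 4)]},
    {"B1": [(2, 7), (1, 6), (4, 2)], "B2": [(3, 9), (4, 3)]},
]
result = top_k_m(groups, k=2, m=2)
# result == [(30, ('A1', 'B1')), (17, ('A2', 'B1'))]
```

In this example, the combination (A1, B1) shares tuples 1 and 2, each with a summed score of 15, so its top-2 score is 30. The baseline computes every intermediate result for every combination, which is the kind of cost the abstract says the proposed algorithm is designed to avoid.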