Efficient top-k aggregation of ranked inputs

Authors:
Nikos Mamoulis; Man Lung Yiu; Kit Hung Cheng; David W. Cheung
Affiliations:
University of Hong Kong, Pokfulam Road, Hong Kong; Aalborg University, Aalborg, Denmark; University of Hong Kong, Pokfulam Road, Hong Kong; University of Hong Kong, Pokfulam Road, Hong Kong
Venue:
ACM Transactions on Database Systems (TODS)
Year:
2007

Abstract

A top-k query combines different rankings of the same set of objects and returns the k objects with the highest combined score according to an aggregate function. We present key observations that impose two phases which any top-k algorithm based on sorted accesses must go through. Based on these observations, we propose a new algorithm designed to minimize the number of object accesses, the computational cost, and the memory requirements of top-k search with monotone aggregate functions. We analyze its cost and show that it is never worse than the baseline “no random accesses” algorithm in terms of computations, accesses, and memory required. As a side contribution, we perform a space analysis that indicates the memory requirements of top-k algorithms that perform only sorted accesses. For the case where the required space exceeds the available memory, we propose disk-based variants of our algorithm. We also propose and optimize a multiway top-k join operator, which has certain advantages over evaluation trees of binary top-k join operators. Finally, we define and study the computation of top-k cubes and the implementation of roll-up and drill-down operations in such cubes. Extensive experiments with synthetic and real data show that, compared to previous techniques, our method accesses fewer objects while being orders of magnitude faster.
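
To make the problem setting concrete, the following is a minimal Python sketch of the baseline "no random accesses" idea that the abstract compares against. It is not the algorithm proposed in the paper. The function name `nra_top_k`, the `floor` parameter, and the use of summation as the monotone aggregate are assumptions made for illustration. The sketch also assumes that every object appears in every ranked list and that no score is below `floor`.

```python
from typing import Dict, Hashable, List, Optional, Tuple

Ranking = List[Tuple[Hashable, float]]  # (object, score), sorted by score descending


def nra_top_k(lists: List[Ranking], k: int, floor: float = 0.0):
    """Baseline-style top-k using sorted accesses only (hypothetical helper).

    Returns (top, depth): `top` holds (object, lower_bound) pairs and `depth`
    is the number of sorted-access rounds performed. Aggregate is assumed to be sum.
    """
    m = len(lists)
    seen: Dict[Hashable, List[Optional[float]]] = {}  # known scores per list
    last = [floor] * m  # last score read from each list (bounds unseen scores)
    max_depth = max((len(lst) for lst in lists), default=0)
    lower: Dict[Hashable, float] = {}

    for depth in range(1, max_depth + 1):
        # One round of sorted access: read the next entry of every list.
        for i, lst in enumerate(lists):
            if depth <= len(lst):
                obj, score = lst[depth - 1]
                seen.setdefault(obj, [None] * m)[i] = score
                last[i] = score
            else:
                last[i] = floor  # list exhausted

        # Lower bound: unknown scores replaced by the floor.
        lower = {o: sum(floor if s is None else s for s in sc)
                 for o, sc in seen.items()}
        # Upper bound: unknown scores replaced by the last score seen in that list.
        upper = {o: sum(last[i] if s is None else s for i, s in enumerate(sc))
                 for o, sc in seen.items()}

        ranked = sorted(seen, key=lambda o: (-lower[o], str(o)))
        if len(ranked) < k:
            continue
        kth = lower[ranked[k - 1]]
        threshold = sum(last)  # best possible score of any unseen object
        if threshold <= kth and all(upper[o] <= kth for o in ranked[k:]):
            return [(o, lower[o]) for o in ranked[:k]], depth

    # Fewer than k objects overall, or no lists: return everything seen.
    ranked = sorted(seen, key=lambda o: (-lower[o], str(o)))
    return [(o, lower[o]) for o in ranked[:k]], max_depth


if __name__ == "__main__":
    r1 = [("a", 0.9), ("b", 0.8), ("c", 0.3), ("d", 0.1)]
    r2 = [("b", 0.9), ("a", 0.7), ("d", 0.4), ("c", 0.2)]
    print(nra_top_k([r1, r2], k=1))
```

The search stops as soon as no seen or unseen object can overtake the current k-th lower bound, so it can finish before reading the lists in full. The sketch keeps a bound for every seen object and re-sorts the candidates in each round, which gives a sense of the computation and memory costs the abstract refers to.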