A general top-k algorithm for web data sources

Authors:
Mehdi Badr;Dan Vodislav
Affiliations:
ETIS, CNRS, University of Cergy-Pontoise, France;ETIS, CNRS, University of Cergy-Pontoise, France
Venue:
DEXA'11 Proceedings of the 22nd international conference on Database and expert systems applications - Volume Part I
Year:
2011

Abstract

Several algorithms have been proposed for top-k query processing over web data sources, where each source returns relevance scores for some query predicate and the scores are aggregated through a composition function. These algorithms assume specific conditions for the type of source access (sorted and/or random) and for the access cost, and propose various heuristics for choosing the next source to probe, generally trying to refine the score of the most promising candidate. We present BreadthRefine (BR), a generic top-k algorithm that works for any combination of source access types and any cost settings. It introduces a new heuristic strategy based on refining all the current top-k candidates, not only the best one. We present an extensive set of experiments comparing BR with state-of-the-art algorithms and show that BR adapts to the specific settings of these algorithms, at lower cost.
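
The setting described in the abstract can be illustrated with a minimal sketch. This is not the BR algorithm itself, whose details the abstract does not give. It is a simplified bounds-based top-k loop under stated assumptions: scores are non-negative, the composition function is a plain sum, an object absent from a source scores 0 there, and every access has unit cost. The function name `breadth_style_top_k`, the round-robin sorted access, and the choice of which missing source to probe are all hypothetical. The one idea taken from the abstract is that each round probes every current top-k candidate rather than only the best one.

```python
import heapq

def breadth_style_top_k(sources, k):
    """sources: list of dicts {object_id: score >= 0}; aggregation is a plain sum (assumption)."""
    m = len(sources)
    # Sorted access: each source streams its (object, score) pairs by descending score.
    streams = [sorted(s.items(), key=lambda kv: -kv[1]) for s in sources]
    cursor = [0] * m
    ceiling = [1.0] * m   # highest score an object not yet returned by source i can have
    known = {}            # object -> {source index: score}
    accesses = {"sorted": 0, "random": 0}

    def lower(obj):       # unknown scores counted as 0
        return sum(known[obj].values())

    def upper(obj):       # unknown scores counted as the source ceiling
        return lower(obj) + sum(ceiling[i] for i in range(m) if i not in known[obj])

    while True:
        # One sorted access on every source that is not exhausted.
        for i in range(m):
            if cursor[i] < len(streams[i]):
                obj, score = streams[i][cursor[i]]
                cursor[i] += 1
                known.setdefault(obj, {})[i] = score
                ceiling[i] = score
                accesses["sorted"] += 1
            else:
                ceiling[i] = 0.0
        # Breadth step: one random access for each of the k candidates with the
        # highest upper bound that still has an unknown score.
        for obj in heapq.nlargest(k, known, key=upper):
            missing = [i for i in range(m) if i not in known[obj]]
            if missing:
                known[obj][missing[0]] = sources[missing[0]].get(obj, 0.0)
                accesses["random"] += 1
        # Stop when the k best lower bounds dominate every other upper bound,
        # including the bound sum(ceiling) on objects never seen so far.
        best = heapq.nlargest(k, known, key=lower)
        exhausted = all(cursor[i] >= len(streams[i]) for i in range(m))
        done = False
        if not exhausted and len(best) == k:
            kth = lower(best[-1])
            rest = max((upper(o) for o in known if o not in best), default=0.0)
            done = kth >= max(rest, sum(ceiling))
        if exhausted or done:
            # Returned values are lower bounds; they are exact only for fully probed objects.
            return [(o, lower(o)) for o in best], accesses

sources = [
    {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.1},
    {"a": 0.8, "b": 0.2, "c": 0.9, "d": 0.1},
]
result, accesses = breadth_style_top_k(sources, k=2)
print(result, accesses)
```

On this toy input the loop stops before object `d` is ever read, because the sum of the source ceilings falls below the second-best lower bound. A cost-aware variant would replace the fixed round-robin and the `missing[0]` choice with a selection driven by per-source access costs, which is the part the abstract attributes to the various heuristics.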