Optimizing Distributed Top-k Queries

Authors:
Thomas Neumann;Matthias Bender;Sebastian Michel;Ralf Schenkel;Peter Triantafillou;Gerhard Weikum
Affiliations:
Max-Planck-Institut Informatik, Saarbrücken, Germany;Max-Planck-Institut Informatik, Saarbrücken, Germany;École Polytechnique Fédérale de Lausanne, Switzerland;Max-Planck-Institut Informatik, Saarbrücken, Germany;RACTI and University of Patras, Greece;Max-Planck-Institut Informatik, Saarbrücken, Germany
Venue:
WISE '08 Proceedings of the 9th international conference on Web Information Systems Engineering
Year:
2008

Citing 19
Cited 1

Using association rules for product assortment decisions: a case study

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Minimal probing: supporting expensive predicates for top-k queries

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Optimizing Multi-Feature Queries for Image Databases

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Query Processing Issues in Image(Multimedia) Databases

ICDE '99 Proceedings of the 15th International Conference on Data Engineering
Optimal aggregation algorithms for middleware

Journal of Computer and System Sciences - Special issue on PODS 2001
Towards Efficient Multi-Feature Queries in Heterogeneous Environments

ITCC '01 Proceedings of the International Conference on Information Technology: Coding and Computing
Evaluating top-k queries over web-accessible databases

ACM Transactions on Database Systems (TODS)
Efficient top-K query calculation in distributed networks

Proceedings of the twenty-third annual ACM symposium on Principles of distributed computing
Progressive Distributed Top-k Retrieval in Peer-to-Peer Networks

ICDE '05 Proceedings of the 21st International Conference on Data Engineering
The threshold join algorithm for top-k queries in distributed sensor networks

DMSN '05 Proceedings of the 2nd international workshop on Data management for sensor networks
KLEE: a framework for distributed top-k query algorithms

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Efficient Query Evaluation on Large Textual Collections in a Peer-to-Peer Environment

P2P '05 Proceedings of the Fifth IEEE International Conference on Peer-to-Peer Computing
Visualizing tags over time

Proceedings of the 15th international conference on World Wide Web
Reducing network traffic in unstructured P2P systems using Top-k queries

Distributed and Parallel Databases
Answering top-k queries using views

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Progressive and selective merge: computing top-k with ad-hoc ranking functions

Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Top-k query evaluation with probabilistic guarantees

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Ad-hoc top-k query answering for data streams

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Efficient processing of distributed top-k queries

DEXA'05 Proceedings of the 16th international conference on Database and Expert Systems Applications

Distributed processing of continuous sliding-window k-NN queries for data stream filtering

World Wide Web

Abstract

Top-k query processing is a fundamental building block for efficient ranking in a large number of applications. Efficiency is a central issue, especially for distributed settings, when the data is spread across different nodes in a network. This paper introduces novel optimization methods for top-k aggregation queries in such distributed environments that can be applied to all algorithms that fall into the frameworks of the prior TPUT and KLEE methods. The optimizations address 1) hierarchically grouping input lists into top-k operator trees and optimizing the tree structure, and 2) computing data-adaptive scan depths for different input sources. The paper presents comprehensive experiments with two different real-life datasets, using the ns-2 network simulator for a packet-level simulation of a large Internet-style network.
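
For readers unfamiliar with the TPUT framework that the abstract builds on, the following is a minimal sketch of a three-phase, threshold-based distributed top-k scheme in the style commonly attributed to TPUT. It is not the optimization proposed in this paper. It assumes sum aggregation over non-negative scores, and it uses in-memory dictionaries as stand-ins for network nodes. The names `tput_top_k` and `kth_largest` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
import heapq


def kth_largest(values, k):
    """Return the k-th largest value, or 0.0 if fewer than k values exist."""
    top = heapq.nlargest(k, values)
    return top[-1] if len(top) == k else 0.0


def partial_sums(seen):
    """Sum, per item, the scores the coordinator has received so far."""
    sums = defaultdict(float)
    for reported in seen:
        for item, score in reported.items():
            sums[item] += score
    return sums


def tput_top_k(nodes, k):
    """Illustrative three-phase top-k over `nodes` (list of item -> score dicts).

    Assumption: scores are non-negative and the aggregate is the plain sum.
    """
    m = len(nodes)

    # Phase 1: every node reports its local top-k items.
    seen = [dict(heapq.nlargest(k, node.items(), key=lambda kv: kv[1]))
            for node in nodes]
    tau1 = kth_largest(partial_sums(seen).values(), k)
    threshold = tau1 / m

    # Phase 2: every node reports all items scoring at least tau1 / m.
    for node, reported in zip(nodes, seen):
        for item, score in node.items():
            if score >= threshold:
                reported[item] = score
    sums = partial_sums(seen)
    tau2 = kth_largest(sums.values(), k)

    # Prune: an unreported local score is below the threshold, so the
    # threshold serves as an upper bound for each missing value.
    candidates = [
        item for item in sums
        if sum(reported.get(item, threshold) for reported in seen) >= tau2
    ]

    # Phase 3: fetch exact scores for the surviving candidates only.
    totals = {item: sum(node.get(item, 0.0) for node in nodes)
              for item in candidates}
    return heapq.nlargest(k, totals.items(), key=lambda kv: kv[1])
```

Calling `tput_top_k(nodes, k)` with one dictionary per node returns up to `k` `(item, total_score)` pairs in descending order of total score. The per-node threshold in phase 2 is the kind of uniform scan depth that data-adaptive scan depths, as mentioned in the abstract, would presumably replace with source-specific values.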