Optimizing Distributed Top-k Queries

  • Authors:
  • Thomas Neumann; Matthias Bender; Sebastian Michel; Ralf Schenkel; Peter Triantafillou; Gerhard Weikum

  • Affiliations:
  • Max-Planck-Institut Informatik, Saarbrücken, Germany; Max-Planck-Institut Informatik, Saarbrücken, Germany; École Polytechnique Fédérale de Lausanne, Switzerland; Max-Planck-Institut Informatik, Saarbrücken, Germany; RACTI and University of Patras, Greece; Max-Planck-Institut Informatik, Saarbrücken, Germany

  • Venue:
  • WISE '08 Proceedings of the 9th international conference on Web Information Systems Engineering
  • Year:
  • 2008

Abstract

Top-k query processing is a fundamental building block for efficient ranking in a large number of applications. Efficiency is a central issue, especially for distributed settings, when the data is spread across different nodes in a network. This paper introduces novel optimization methods for top-k aggregation queries in such distributed environments that can be applied to all algorithms that fall into the frameworks of the prior TPUT and KLEE methods. The optimizations address 1) hierarchically grouping input lists into top-k operator trees and optimizing the tree structure, and 2) computing data-adaptive scan depths for different input sources. The paper presents comprehensive experiments with two different real-life datasets, using the ns-2 network simulator for a packet-level simulation of a large Internet-style network.
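
The abstract names TPUT as one of the baseline frameworks but does not describe it. For orientation, the following is a minimal single-process sketch of a three-phase, threshold-based top-k aggregation in the style of TPUT as it is commonly described. It is not the paper's optimized method and includes neither the operator trees nor the adaptive scan depths. The function names (`tput_top_k`, `partial_sums`, `kth_largest`), the in-memory dictionaries standing in for network nodes, summation as the aggregation function, and non-negative scores are all assumptions made for illustration.

```python
import heapq
from collections import defaultdict


def kth_largest(values, k):
    """Return the k-th largest value, or 0.0 if there are fewer than k values."""
    top = heapq.nlargest(k, values)
    return top[-1] if len(top) == k else 0.0


def partial_sums(reported):
    """Sum the scores the coordinator has received so far, per item."""
    sums = defaultdict(float)
    for node_report in reported:
        for item, score in node_report.items():
            sums[item] += score
    return sums


def tput_top_k(nodes, k):
    """nodes: non-empty list of dicts mapping item -> non-negative local score.
    Returns the k items with the largest summed score as (item, total) pairs."""
    m = len(nodes)

    # Phase 1: every node reports its local top-k; the k-th largest partial
    # sum tau1 is a lower bound on the final k-th best total.
    reported = [
        dict(heapq.nlargest(k, scores.items(), key=lambda kv: kv[1]))
        for scores in nodes
    ]
    tau1 = kth_largest(partial_sums(reported).values(), k)
    threshold = tau1 / m  # uniform per-node threshold

    # Phase 2: every node additionally reports all items scoring >= threshold.
    for i, scores in enumerate(nodes):
        for item, score in scores.items():
            if score >= threshold:
                reported[i][item] = score
    partial = partial_sums(reported)
    tau2 = kth_largest(partial.values(), k)

    # An unreported score is below the threshold, so known sum plus
    # threshold per missing node is an upper bound; prune items below tau2.
    candidates = []
    for item, known in partial.items():
        missing = sum(1 for node_report in reported if item not in node_report)
        if known + missing * threshold >= tau2:
            candidates.append(item)

    # Phase 3: fetch exact scores for the surviving candidates only.
    exact = {
        item: sum(scores.get(item, 0.0) for scores in nodes)
        for item in candidates
    }
    return sorted(exact.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
```

In this sketch, calling `tput_top_k(nodes, 2)` on three nodes holding scores for items `a` to `d` only fetches exact totals in the last phase for items that survive the pruning step; a real deployment would replace the dictionary lookups with network round trips.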