Finding the K highest-ranked answers in a distributed network

  • Authors:
  • Demetrios Zeinalipour-Yazti;Zografoula Vagena;Vana Kalogeraki;Dimitrios Gunopulos;Vassilis J. Tsotras;Michail Vlachos;Nick Koudas;Divesh Srivastava

  • Affiliations:
  • University of Cyprus, Dept. of Computer Science, 75 Kallipoleos Str., P.O. Box 20537, CY-1678, Nicosia, Cyprus;Microsoft Research Cambridge, Cambridge, United Kingdom;AUEB, Athens, Greece and UC - Riverside, Riverside, CA, United States;University of Athens, Athens, Greece;UC - Riverside, Riverside, CA, United States;IBM Research Zurich, Rueschlikon, Switzerland;University of Toronto, Toronto, ON, Canada;AT&T Research Labs, Florham Park, NJ, United States

  • Venue:
  • Computer Networks: The International Journal of Computer and Telecommunications Networking
  • Year:
  • 2009

Abstract

In this paper, we present an algorithm for finding the k highest-ranked (or Top-k) answers in a distributed network. A Top-k query returns the subset of most relevant answers, rather than all answers, for two reasons: (i) to minimize the cost associated with retrieving all answers; and (ii) to improve the recall and precision of the answer set, so that the user is not overwhelmed with irrelevant results. Our study focuses on multi-hop distributed networks in which the data is accessible by traversing a network of nodes. Such a setting captures very well the computation framework of emerging Sensor Networks, Peer-to-Peer Networks and Vehicular Networks. We present the Threshold Join Algorithm (TJA), an efficient algorithm that utilizes a non-uniform threshold on the queried attribute in order to minimize the transfer of data when a query is executed. Additionally, TJA resolves queries in the network rather than in a centralized fashion, which further reduces bandwidth consumption and delay. We performed an extensive experimental evaluation of our algorithm using a real testbed of 75 workstations along with a trace-driven experimental methodology. Our results indicate that TJA requires an order of magnitude less communication than the state-of-the-art and scales well with respect to both the parameter k and the network topology.
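
To make the query semantics concrete, the following is a minimal Python sketch of a Top-k query answered by in-network aggregation over a routing tree. It is not TJA itself: the abstract does not spell out how the threshold is computed, so no threshold-based pruning is attempted here. The tree layout, the node and object names, the score values, and the choice of summation as the scoring function are all assumptions made for illustration.

```python
import heapq
from collections import Counter

def top_k(scores, k):
    """Return the k (object, score) pairs with the highest scores."""
    # Ties are broken by object name so the result is deterministic.
    return heapq.nlargest(k, scores.items(), key=lambda item: (item[1], item[0]))

def aggregate_in_network(node, children, local_scores):
    """Sum per-object scores bottom-up along a routing tree.

    Each node merges the partial sums of its children with its own
    local scores and forwards one combined list to its parent.
    """
    total = Counter(local_scores[node])
    for child in children.get(node, []):
        total.update(aggregate_in_network(child, children, local_scores))
    return total

# Hypothetical four-node routing tree rooted at "sink" (assumed data).
children = {"sink": ["a", "b"], "a": ["c"]}
local_scores = {
    "sink": {"o1": 1, "o2": 2},
    "a": {"o1": 5, "o3": 2},
    "b": {"o2": 3, "o3": 7},
    "c": {"o1": 2, "o2": 1},
}

totals = aggregate_in_network("sink", children, local_scores)
result = top_k(totals, 2)
print(result)  # [('o3', 9), ('o1', 8)]
```

In this sketch every node still forwards a partial sum for every object it has seen. A threshold-based method such as the one the abstract describes aims to avoid exactly that transfer for objects that cannot reach the Top-k.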