Processing of Rank Joins in Highly Distributed Systems

  • Authors:
  • Christos Doulkeridis;Akrivi Vlachou;Kjetil Nørvåg;Yannis Kotidis;Neoklis Polyzotis

  • Venue:
  • ICDE '12 Proceedings of the 2012 IEEE 28th International Conference on Data Engineering
  • Year:
  • 2012

Abstract

In this paper, we study the efficient processing of rank joins in highly distributed systems, where servers store fragments of relations autonomously. Existing rank-join algorithms perform poorly in this setting because of excessive communication costs or high latency. We propose a novel distributed rank-join framework that uses data statistics, maintained as histograms, to determine the subset of each relational fragment that must be fetched to generate the top-k join results. At the heart of our framework lies a distributed score bound estimation algorithm that produces, for each relation, score bounds that are sufficient to guarantee the correctness of the rank-join result set when the histograms are accurate. Furthermore, we propose a generalization of our framework that supports approximate statistics for cases in which exact statistical information is not available. An extensive experimental study validates the efficiency of our framework and demonstrates its advantages over existing methods.
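
To make the idea of histogram-based score bounds concrete, here is a minimal, centralized Python sketch. It is not the paper's algorithm, which the abstract does not specify. It assumes two relations, a sum as the scoring function, and exact histograms keyed by join value and score bucket. The function name `estimate_score_bounds` and the histogram layout are hypothetical choices made for illustration only.

```python
from collections import defaultdict


def estimate_score_bounds(hist_r, hist_s, k):
    """Illustrative sketch only (assumed setting, not the paper's method).

    hist_r, hist_s: dicts mapping (join_key, (lo, hi)) -> tuple count,
    where (lo, hi) is the score range of a histogram bucket.
    Returns (bound_r, bound_s): tuples scoring below these bounds cannot
    appear in the top-k results under a sum scoring function, or None if
    the histograms do not guarantee at least k join results.
    """
    # Index the buckets of S by join key for fast matching.
    s_by_key = defaultdict(list)
    for (key, (lo, _hi)), count in hist_s.items():
        s_by_key[key].append((lo, count))

    # Each pair of buckets sharing a join key yields count_r * count_s
    # join results, each scoring at least lo_r + lo_s.
    pairs = []
    for (key, (lo_r, _hi)), count_r in hist_r.items():
        for lo_s, count_s in s_by_key.get(key, []):
            pairs.append((lo_r + lo_s, count_r * count_s))

    # Accumulate results from the highest guaranteed score downwards
    # until k results are covered; tau is a lower bound on the k-th score.
    pairs.sort(reverse=True)
    produced, tau = 0, None
    for min_score, n_results in pairs:
        produced += n_results
        if produced >= k:
            tau = min_score
            break
    if tau is None:
        return None

    # A tuple of R can reach tau only if its score plus the best possible
    # score in S is at least tau (and symmetrically for S).
    max_r = max(hi for (_key, (_lo, hi)) in hist_r)
    max_s = max(hi for (_key, (_lo, hi)) in hist_s)
    return (tau - max_s, tau - max_r)


# Hypothetical histograms with integer scores in [0, 100].
hist_r = {("a", (80, 100)): 1, ("b", (60, 80)): 2, ("a", (0, 20)): 3}
hist_s = {("a", (60, 80)): 2, ("b", (80, 100)): 1, ("b", (20, 40)): 4}
bounds = estimate_score_bounds(hist_r, hist_s, k=3)
```

In this sketch, only tuples whose scores reach the returned bounds would need to be fetched from each fragment. The distributed estimation and the handling of approximate statistics described in the abstract are outside its scope.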