Distributed antipole clustering for efficient data search and management in Euclidean and metric spaces

  • Authors:
  • Alfredo Ferro; Rosalba Giugno; Misael Mongioví; Giuseppe Pigola; Alfredo Pulvirenti

  • Affiliations:
  • University of Catania, Dept. of Mathematics and Computer Science, Catania, Italy (all authors); Proteo S.p.A., Research and Development Department, Catania, Italy (Misael Mongioví)

  • Venue:
  • IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
  • Year:
  • 2006

Abstract

This paper proposes a simple and efficient distributed version of the recently introduced Antipole Clustering algorithm for general metric spaces. The underlying structure, called the Antipole Tree, combines ideas from the M-Tree, the Multi-Vantage Point structure and the FQ-Tree, and belongs to the "bisector tree" class. Bisection is based on proximity to an "Antipole" pair of elements generated by a suitable linear randomized tournament. The final winners (A, B) of such a tournament are far enough apart to approximate the diameter of the set being split. A simple linear algorithm that computes Antipoles in Euclidean spaces with exponentially small approximation ratio is also proposed. Antipole Tree Clustering has been shown to be very effective in important applications such as range and k-nearest neighbor searching, clustering of mobile objects in centralized wireless networks with movable base stations, and multiple alignment of biological sequences. Many such applications need an efficient distributed clustering algorithm. In the proposed distributed versions of Antipole Clustering, the amount of data passed from one node to another is either constant or proportional to the number of nodes in the network. The Distributed Antipole Tree is equipped with additional information to support efficient range search and dynamic cluster management. This is achieved by augmenting the randomized tournament technique with methodologies taken from established systems such as BFR and BIRCH*. Experiments show the good performance of the proposed algorithms on both real and synthetic data.
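
The tournament and bisection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the tournament rules, so the sketch assumes that each round shuffles the surviving elements, splits them into groups of a fixed size, and lets each group advance only its two mutually farthest elements. The names `farthest_pair`, `antipole_pair`, `bisect`, and the parameter `group_size` are hypothetical.

```python
import itertools
import math
import random


def farthest_pair(points, dist):
    """Exact farthest pair by brute force; quadratic, so used only on small groups."""
    return max(itertools.combinations(points, 2), key=lambda p: dist(p[0], p[1]))


def antipole_pair(points, dist, group_size=3, rng=None):
    """Approximate farthest pair via a randomized tournament (assumed rules)."""
    if group_size < 3:
        raise ValueError("group_size must be at least 3 so each round shrinks the set")
    survivors = list(points)
    if len(survivors) < 2:
        raise ValueError("need at least two points")
    rng = rng or random.Random(0)
    while len(survivors) > group_size:
        rng.shuffle(survivors)
        winners = []
        for i in range(0, len(survivors), group_size):
            group = survivors[i:i + group_size]
            if len(group) < 2:
                winners.extend(group)  # a lone leftover element advances unchallenged
            else:
                winners.extend(farthest_pair(group, dist))
        survivors = winners
    # Final round: exact farthest pair among the few remaining winners.
    return farthest_pair(survivors, dist)


def bisect(points, a, b, dist):
    """Split points by proximity to the pair (a, b); ties go to a."""
    near_a, near_b = [], []
    for p in points:
        (near_a if dist(p, a) <= dist(p, b) else near_b).append(p)
    return near_a, near_b


# Usage on synthetic Euclidean data: 20 collinear points.
pts = [(float(i), 0.0) for i in range(20)]
a, b = antipole_pair(pts, math.dist)
left, right = bisect(pts, a, b, math.dist)
```

Because `dist` is passed in as a function, the same sketch applies to any metric, not only the Euclidean one used here. A recursive tree would be built by applying `antipole_pair` and `bisect` again to each half until a cluster is small enough, with the stopping rule left as a design choice.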