TR-clustering: Alleviating the impact of false clustering on P2P overlay networks

  • Authors:
  • Marc Sánchez-Artigas;Pedro García-López;Antonio F. Gómez-Skarmeta;José Santa

  • Affiliations:
  • Department of Computer Engineering and Mathematics, Campus Sescelades, Universitat Rovira i Virgili, Tarragona, Spain;Department of Computer Engineering and Mathematics, Campus Sescelades, Universitat Rovira i Virgili, Tarragona, Spain;Department of Information and Communications Engineering, Campus de Espinardo, Universidad de Murcia, Murcia, Spain;Department of Information and Communications Engineering, Campus de Espinardo, Universidad de Murcia, Murcia, Spain

  • Venue:
  • Computer Networks: The International Journal of Computer and Telecommunications Networking
  • Year:
  • 2008

Abstract

Designing topologically-aware overlays is a recurrent subject in peer-to-peer research. Although a plethora of approaches exists, Internet coordinate systems such as GNP (which attempt to predict the O(N^2) pairwise latencies between N nodes using only O(N) measurements) have become the most attractive way to make overlay connectivity structures congruent with the underlying IP-level network topology. With appropriate input, coordinate systems allow complex distributed problems, including multicast and server selection, to be solved geometrically. For these applications, and presumably others like them, exact topological information is not required; informative hints about the relative positions of Internet clients are sufficient. A clustering operation, which partitions a set of objects into subsets that are distinguishable under some similarity criterion, could significantly ease these tasks. However, when the main objective is to cluster nodes, Internet coordinate systems are severely limited in their ability to identify the right clusters, a problem known as false clustering. In this work, the authors address a fundamental question that proximity techniques have so far left obscure: how often false clustering happens in practice and how much it affects the overall performance of an overlay. To that end, they present a novel approach called TR-Clustering, which clusters nodes in overlay networks based on their physical positions on the Internet. Specifically, TR-Clustering uses the Internet routers with high vertex betweenness centrality to cluster participating nodes. Informally, the betweenness centrality of a router is the fraction of shortest paths between all pairs of nodes that run through it. Simulation results indicate that TR-Clustering is superior to existing techniques, with fewer than 5% of peers falsely clustered (relative to the datasets used in the evaluation).
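
The informal definition of betweenness centrality given in the abstract can be made concrete with a small sketch. The abstract does not describe the TR-Clustering algorithm itself, so the code below is only an illustration: the toy topology, the function names, the threshold value, and the "assign each node to its nearest high-betweenness router" step are all assumptions made for this example, not the authors' method. Centrality is computed with Brandes' algorithm on an unweighted, undirected graph and normalised so that each score is the average fraction of shortest paths, over all pairs of other nodes, that pass through the node.

```python
from collections import deque

def betweenness(adj):
    """Normalised vertex betweenness (Brandes' algorithm, unweighted graph)."""
    nodes = list(adj)
    score = {v: 0.0 for v in nodes}
    for s in nodes:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist = {s: 0}
        sigma = {v: 0 for v in nodes}
        sigma[s] = 1
        preds = {v: [] for v in nodes}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    n = len(nodes)
    norm = (n - 1) * (n - 2)  # ordered pairs of nodes other than v
    return {v: (score[v] / norm if norm else 0.0) for v in nodes}

def cluster_by_central_routers(adj, threshold):
    """Hypothetical grouping step: nodes whose centrality is at least
    `threshold` become cluster heads; every other node joins the head
    that is closest in hop count (multi-source BFS)."""
    bc = betweenness(adj)
    heads = sorted(v for v in adj if bc[v] >= threshold)
    label = {h: h for h in heads}
    queue = deque(heads)
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in label:
                label[w] = label[v]
                queue.append(w)
    return label

# Toy topology (assumed): two routers, each with two attached end hosts.
topology = {
    "r1": ["a", "b", "r2"],
    "r2": ["c", "d", "r1"],
    "a": ["r1"], "b": ["r1"],
    "c": ["r2"], "d": ["r2"],
}

centrality = betweenness(topology)
clusters = cluster_by_central_routers(topology, threshold=0.5)
```

In this toy graph, each router lies on the shortest path for 7 of the 10 pairs formed by the other five nodes, while the end hosts lie on none, so the two routers are the only cluster heads and each host is grouped with the router it attaches to.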