Fundamental effects of clustering on the Euclidean embedding of internet hosts

  • Authors:
  • Sanghwan Lee;Zhi-Li Zhang;Sambit Sahu;Debanjan Saha;Mukund Srinivasan

  • Affiliations:
  • Kookmin University, Seoul, Korea;University of Minnesota, Minneapolis, MN;IBM T.J. Watson Research Center, Hawthorne, NY;IBM T.J. Watson Research Center, Hawthorne, NY;University of Minnesota, Minneapolis, MN

  • Venue:
  • NETWORKING'07 Proceedings of the 6th international IFIP-TC6 conference on Ad Hoc and sensor networks, wireless networks, next generation internet
  • Year:
  • 2007

Abstract

Network distance estimation schemes based on Euclidean embedding have been shown to provide reasonably good overall accuracy. Although recent studies have revealed that triangle inequality violations (TIVs) inherent in network distances among Internet hosts fundamentally limit their accuracy, these Euclidean embedding methods remain appealing and useful for many applications because of their simplicity and scalability. In this paper, we investigate why Euclidean embedding achieves reasonable accuracy despite the prevalence of TIVs, focusing in particular on the effect of clustering among Internet hosts. Through mathematical analysis and experiments, we demonstrate that clustering of Internet hosts reduces the effective dimension of the distances, so a low-dimensional Euclidean embedding suffices to produce reasonable accuracy. Our findings also provide guidelines for selecting landmarks to improve accuracy, and explain why randomly selecting a large number of landmarks improves accuracy.
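
To make the notion of a triangle inequality violation concrete, the following is a minimal sketch of how the fraction of violating host triples could be counted in a measured delay matrix. It is an illustration only, not the authors' analysis: the function name `tiv_fraction` and the `delays` matrix (made-up values in milliseconds) are assumptions introduced here.

```python
from itertools import combinations


def tiv_fraction(dist):
    """Return the fraction of host triples that violate the triangle inequality.

    `dist` is assumed to be a symmetric square matrix (list of lists)
    of pairwise network distances, e.g. round-trip times.
    """
    n = len(dist)
    total = 0
    violations = 0
    for i, j, k in combinations(range(n), 3):
        a, b, c = dist[i][j], dist[j][k], dist[i][k]
        total += 1
        # A triple violates the inequality when its longest side
        # exceeds the sum of the other two sides.
        longest = max(a, b, c)
        if longest > (a + b + c) - longest:
            violations += 1
    return violations / total if total else 0.0


# Hypothetical symmetric delay matrix in milliseconds (made-up values).
delays = [
    [0, 10, 50, 12],
    [10, 0, 15, 20],
    [50, 15, 0, 18],
    [12, 20, 18, 0],
]

print(tiv_fraction(delays))  # prints 0.5
```

In this toy matrix, two of the four triples violate the inequality because the 50 ms path between hosts 0 and 2 is longer than either detour through host 1 or host 3. No Euclidean embedding can reproduce such a triple exactly, which is the limitation the abstract refers to.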