Fundamental effects of clustering on the Euclidean embedding of internet hosts

Authors:
Sanghwan Lee; Zhi-Li Zhang; Sambit Sahu; Debanjan Saha; Mukund Srinivasan
Affiliations:
Kookmin University, Seoul, Korea; University of Minnesota, Minneapolis, MN; IBM T.J. Watson Research Center, Hawthorne, NY; IBM T.J. Watson Research Center, Hawthorne, NY; University of Minnesota, Minneapolis, MN
Venue:
NETWORKING'07 Proceedings of the 6th international IFIP-TC6 conference on Ad Hoc and sensor networks, wireless networks, next generation internet
Year:
2007

Abstract

Network distance estimation schemes based on Euclidean embedding have been shown to provide reasonably good overall accuracy. Although recent studies have revealed that triangle inequality violations (TIVs) inherent in network distances among Internet hosts fundamentally limit that accuracy, these methods remain appealing and useful for many applications because of their simplicity and scalability. In this paper, we investigate why Euclidean embedding achieves reasonable accuracy despite the prevalence of TIVs, focusing in particular on the effect of clustering among Internet hosts. Through mathematical analysis and experiments, we demonstrate that clustering of Internet hosts reduces the effective dimension of the distances, so a low-dimensional Euclidean embedding suffices to produce reasonable accuracy. Our findings also provide guidelines on how to select landmarks to improve accuracy, and explain why random selection of a large number of landmarks improves accuracy.
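
The claim that clustering reduces the effective dimension of the distances can be illustrated with a small numerical sketch. This is not the paper's analysis or data: it assumes an idealized setting in which every host in a cluster has identical delays to all other hosts, and it assumes "effective dimension" can be approximated by the number of non-negligible singular values of the distance matrix. The function names, the delay range, and the tolerance are hypothetical choices made only for illustration.

```python
import numpy as np


def clustered_distance_matrix(n_hosts, n_clusters, rng):
    """Build an idealized delay matrix (assumption: hosts in the same
    cluster share identical distances to every other host)."""
    # Hypothetical inter-cluster delays in ms: symmetric, zero diagonal.
    upper = np.triu(rng.uniform(20.0, 200.0, size=(n_clusters, n_clusters)), k=1)
    inter = upper + upper.T
    # Assign hosts to clusters round-robin.
    labels = np.arange(n_hosts) % n_clusters
    # Host-to-host distance is the distance between the hosts' clusters.
    return inter[np.ix_(labels, labels)]


def effective_dimension(dist, rel_tol=1e-6):
    """Count singular values that are non-negligible relative to the largest
    (one possible proxy for effective dimension, assumed here)."""
    s = np.linalg.svd(dist, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))


rng = np.random.default_rng(0)
n_hosts, n_clusters = 40, 4
dist = clustered_distance_matrix(n_hosts, n_clusters, rng)
dim = effective_dimension(dist)
print(f"{n_hosts} hosts, {n_clusters} clusters, effective dimension: {dim}")
```

In this idealized case the 40 x 40 distance matrix is built from only 4 distinct rows, so its effective dimension cannot exceed the number of clusters, however many hosts are added. Real delay measurements include intra-cluster variation and TIVs, so the spectrum would decay gradually rather than cut off sharply.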