Estimating peer similarity using distance of shared files

  • Authors:
  • Yuval Shavitt;Ela Weinsberg;Udi Weinsberg

  • Affiliations:
  • Tel-Aviv University, Israel;Tel-Aviv University, Israel;Tel-Aviv University, Israel

  • Venue:
  • IPTPS'10 Proceedings of the 9th international conference on Peer-to-peer systems
  • Year:
  • 2010

Abstract

Peer-to-peer (p2p) networks are used by millions of users for sharing content. As these networks grow in popularity, it becomes increasingly difficult to find useful content among the abundance of shared files. Modern p2p networks and similar social services must adopt new methods to help users locate content efficiently, and to this end they use approximate meta-data search and recommendation systems. However, meta-data is often missing or wrong, and recommender systems are ill-suited to p2p networks because of inherent difficulties such as implicit ranking, noise in user-generated content, and the extreme dimensions and sparseness of the network. This paper attempts to bridge this gap by suggesting a new metric for peer similarity, which can be used to improve content search and recommendation in large-scale p2p networks and semi-centralized services, such as p2p IPTV. Unlike commonly used vector distance functions, which are shown to be ill-suited to p2p networks due to low overlap between peers, this work leverages a file similarity graph to estimate the similarity between peers that have little or no overlap of shared files. Using 100k peers sharing over 500k songs in the Gnutella network, we show the advantages of the proposed metric over commonly used geographical locality and vector distance measures.
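
The contrast drawn in the abstract, between overlap-based vector distances and a distance computed over a file similarity graph, can be illustrated with a small sketch. This is not the paper's metric: the abstract does not specify how the graph is built or how file distances are aggregated, so the co-sharing edge weights, the shortest-path distance, the nearest-file averaging, and all names (`build_file_graph`, `peer_distance`, the toy peers and songs) are assumptions made purely for illustration.

```python
import heapq
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Overlap-based similarity between two peers' file sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_file_graph(peers):
    """Assumed construction: link two files when some peer shares both.
    The more peers co-share a pair, the shorter the edge (weight 1 / count)."""
    counts = defaultdict(int)
    for files in peers.values():
        for f, g in combinations(sorted(files), 2):
            counts[(f, g)] += 1
    graph = defaultdict(dict)
    for (f, g), c in counts.items():
        graph[f][g] = graph[g][f] = 1.0 / c
    return graph


def shortest_distances(graph, source):
    """Dijkstra's algorithm from one file to every reachable file."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, w in graph.get(node, {}).items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist


def peer_distance(graph, files_a, files_b):
    """Hypothetical aggregation: average, over the files of peer A,
    of the graph distance to the nearest file of peer B."""
    total = 0.0
    for f in files_a:
        dist = shortest_distances(graph, f)
        total += min(dist.get(g, float("inf")) for g in files_b)
    return total / len(files_a)


# Toy data: alice and carol share no files, but bob links their songs.
peers = {
    "alice": {"song1", "song2"},
    "bob": {"song2", "song3"},
    "carol": {"song3", "song4"},
    "dave": {"song9", "song10"},
}
graph = build_file_graph(peers)
```

In this toy setting, `jaccard(peers["alice"], peers["carol"])` is zero because the two peers have no file in common, while `peer_distance(graph, peers["alice"], peers["carol"])` is finite because their songs are connected through files co-shared with bob. A peer such as dave, whose files are unreachable in the graph, remains at infinite distance.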