Probabilistic file indexing and searching in unstructured peer-to-peer networks

Authors:
An-Hsun Cheng; Yuh-Jzer Joung
Affiliations:
Department of Information Management, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei 106, Taiwan, ROC (both authors)
Venue:
Computer Networks: The International Journal of Computer and Telecommunications Networking
Year:
2006

Abstract

Thanks to advances in network and computing technology, peer-to-peer (P2P) systems have become a popular way to share files. A huge number of files can now be accessed and downloaded directly with a simple mouse click. Among the types of P2P networks, the unstructured architecture has proven quite successful, mainly because of its simplicity and robustness. However, searching for distant and rare files remains a challenging problem in unstructured P2P networks. Existing approaches either have poor response time or generate too much network traffic. In this paper we propose a simple, practical, yet powerful index scheme to enhance search in unstructured P2P networks. The scheme uses Bloom filters to index the files shared at each node, and then lets nodes gossip with one another to exchange their Bloom filters. In effect, each node indexes a random set of files in the network, so every query has a constant probability of being resolved within a fixed search space. Experimental results show that our approach can improve search in Gnutella by an order of magnitude. For example, in a typical Gnutella network of about 89,000 nodes, replicating each node's Bloom filter to fewer than 0.45% of the nodes allows 70% of queries to be resolved within a search space of 200 nodes. In contrast, without the index scheme only 1.6% of queries can be resolved within the same search space; alternatively, more than 48,000 nodes must be searched in Gnutella to reach the same success rate.
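As a rough illustration of the indexing idea in the abstract, the Python sketch below shows a node summarizing its shared files in a Bloom filter and checking filters it has received from other nodes before forwarding a query. It is not the authors' implementation: the class names (BloomFilter, Node), the filter size, the number of hash functions, and the SHA-256-based hashing are all assumptions chosen for illustration, and the gossip exchange is reduced to a direct hand-off of one filter.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter over strings (sizes are illustrative assumptions)."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means definitely absent; True may be a false positive.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


class Node:
    """Hypothetical peer that indexes its own files and stores gossiped filters."""

    def __init__(self, node_id, files):
        self.node_id = node_id
        self.own_filter = BloomFilter()
        for name in files:
            self.own_filter.add(name)
        self.remote_filters = {}  # node_id -> BloomFilter received via gossip

    def receive_filter(self, other):
        # Stand-in for one gossip step: keep a copy of another node's filter.
        self.remote_filters[other.node_id] = other.own_filter

    def candidates(self, filename):
        # Ids of known remote nodes whose filters match the file name.
        return [nid for nid, bf in self.remote_filters.items()
                if bf.might_contain(filename)]


alice = Node("alice", ["rare-song.mp3", "paper.pdf"])
bob = Node("bob", ["holiday.jpg"])
bob.receive_filter(alice)
print(bob.candidates("rare-song.mp3"))
```

A match only says that the remote node probably holds the file, so a real query would still have to be confirmed with that node. A miss is definitive, which is what lets a node skip peers whose filters do not match.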