A content model for evaluating peer-to-peer searching techniques

Authors:
Brian F. Cooper
Affiliations:
Georgia Institute of Technology
Venue:
Proceedings of the 5th ACM/IFIP/USENIX international conference on Middleware
Year:
2004

Abstract

Simulation studies are frequently used to evaluate new peer-to-peer searching techniques as well as existing techniques on new applications. Unless these studies are accurate in their modeling of queries and documents, they may not reflect how search techniques will perform in real networks, leading to incorrect conclusions about which techniques are best. We describe how to model content so that simulations produce accurate results. We present a content model for peer-to-peer networks, which consists of a tripartite graph with edges connecting queries to the documents they match, and documents to the peers they are stored at. Our model also includes a set of statistics describing how often queries match the same documents, and how often similar documents are stored at the same peer. We can construct our tripartite content model by running queries over live data stored at real Internet nodes, and simulation results show that searching techniques do indeed perform differently in simulations using this "real" content model versus a randomly generated model. We then present an algorithm for using real content gathered from a small set of peers (say, 1,000) to generate a synthetic content model for large simulated networks (say, 10,000 nodes or more). Finally, we use a synthetic model generated from World Wide Web documents and queries to compare the performance of several search algorithms that have been reported in the literature.
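
The tripartite structure described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the class name `ContentModel`, its method names, the toy identifiers (`q1`, `d1`, `pA`), and the choice of Jaccard overlap as the "how often queries match the same documents" statistic are all assumptions made here for illustration.

```python
from collections import defaultdict


class ContentModel:
    """Hypothetical tripartite graph: queries -> documents -> peers."""

    def __init__(self):
        self.query_docs = defaultdict(set)  # query -> documents it matches
        self.doc_peers = defaultdict(set)   # document -> peers storing it

    def add_match(self, query, doc):
        """Add an edge from a query to a document it matches."""
        self.query_docs[query].add(doc)

    def add_copy(self, doc, peer):
        """Add an edge from a document to a peer that stores it."""
        self.doc_peers[doc].add(peer)

    def peers_for_query(self, query):
        """Peers holding at least one document that matches the query."""
        peers = set()
        for doc in self.query_docs.get(query, ()):
            peers |= self.doc_peers.get(doc, set())
        return peers

    def query_overlap(self, q1, q2):
        """Jaccard overlap of the document sets matched by two queries.

        Assumed statistic; the abstract does not define the exact measure.
        """
        a = self.query_docs.get(q1, set())
        b = self.query_docs.get(q2, set())
        union = a | b
        return len(a & b) / len(union) if union else 0.0


# Toy data, invented purely for illustration.
model = ContentModel()
for q, d in [("q1", "d1"), ("q1", "d2"), ("q2", "d2"), ("q2", "d3")]:
    model.add_match(q, d)
for d, p in [("d1", "pA"), ("d2", "pA"), ("d2", "pB"), ("d3", "pC")]:
    model.add_copy(d, p)

print(sorted(model.peers_for_query("q1")))  # ['pA', 'pB']
print(model.query_overlap("q1", "q2"))
```

In this sketch a simulated search for `q1` succeeds only if it reaches `pA` or `pB`, which is the kind of dependency between queries, documents, and peers that a randomly generated content assignment would not necessarily preserve.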