We consider the problem of searching a document collection using a set of independent computers. That is, the computers do not cooperate with one another either (i) to acquire their local index of documents or (ii) during the retrieval of a document. During the acquisition phase, each computer is assumed to randomly sample a subset of the entire collection. During retrieval, the query is issued to a random subset of computers, each of which returns its results to the query-issuer, who consolidates the results. We examine how the number of computers and the fraction of the collection that each computer indexes affect performance in comparison to a traditional deterministic configuration. We provide analytic formulae that, given the number of computers and the fraction of the collection each computer indexes, give the probability of an approximately correct search, where a "correct search" is defined to be the result of a deterministic search on the entire collection. We show that the randomized distributed search algorithm can have acceptable performance under a range of parameter settings. Simulation results confirm our analysis.
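The acquisition and retrieval phases described above can be illustrated with a small simulation. The following is a minimal sketch, not the paper's own formulae or experimental setup, which are not reproduced in this abstract. It assumes that each computer samples its documents uniformly and independently, and that the "correct" result set for a query can be modeled as `top_r` documents drawn at random. All function and parameter names (`coverage_probability`, `simulate_found_fraction`, `fraction`, `k`, `top_r`) are hypothetical. Under those assumptions, a given document is held by at least one of `k` queried computers with probability 1 - (1 - fraction)^k, and the simulation estimates the same quantity empirically.

```python
import random


def coverage_probability(fraction: float, k: int) -> float:
    """Probability that a given document is indexed by at least one of the
    k queried computers, assuming independent uniform sampling (assumption)."""
    return 1.0 - (1.0 - fraction) ** k


def simulate_found_fraction(n_docs: int, n_computers: int, fraction: float,
                            k: int, top_r: int, trials: int, seed: int = 0) -> float:
    """Estimate the average fraction of the 'correct' result set that a
    randomized search retrieves."""
    rng = random.Random(seed)
    sample_size = round(fraction * n_docs)

    # Acquisition phase: every computer independently samples its own
    # subset of the collection, with no coordination between computers.
    indexes = [set(rng.sample(range(n_docs), sample_size))
               for _ in range(n_computers)]

    found = 0
    for _ in range(trials):
        # Stand-in for the deterministic result on the full collection:
        # top_r documents chosen at random (modeling assumption).
        correct = rng.sample(range(n_docs), top_r)

        # Retrieval phase: the query goes to k randomly chosen computers,
        # and the query-issuer merges whatever they hold.
        queried = rng.sample(indexes, k)
        merged = set().union(*queried)

        found += sum(1 for doc in correct if doc in merged)

    return found / (trials * top_r)


if __name__ == "__main__":
    analytic = coverage_probability(0.1, 10)  # 1 - 0.9**10, about 0.651
    simulated = simulate_found_fraction(n_docs=1000, n_computers=50,
                                        fraction=0.1, k=10, top_r=10,
                                        trials=2000)
    print(f"analytic:  {analytic:.3f}")
    print(f"simulated: {simulated:.3f}")
```

Increasing either `k` or `fraction` in this sketch raises the coverage probability, which is the trade-off between the number of computers queried and the fraction of the collection each one indexes that the abstract refers to.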