An Architecture for Hybrid P2P Free-Text Search

Authors:
Avi Rosenfeld; Claudia V. Goldman; Gal A. Kaminka; Sarit Kraus
Affiliations:
Department of Industrial Engineering, Jerusalem College of Technology, Jerusalem, Israel, and Department of Computer Science, Bar Ilan University, Ramat Gan, Israel; Samsung Telecom Research Israel, Herzliya, Israel; Department of Computer Science, Bar Ilan University, Ramat Gan, Israel; Department of Computer Science, Bar Ilan University, Ramat Gan, Israel
Venue:
CIA '07 Proceedings of the 11th international workshop on Cooperative Information Agents XI
Year:
2007

Abstract

Recent advances in peer-to-peer (P2P) search algorithms have presented viable structured and unstructured approaches for full-text search. We posit that these existing approaches are each best suited for different types of queries. We present PHIRST, the first system to facilitate effective full-text search within P2P networks. PHIRST works by effectively leveraging the relative strengths of these approaches. Similar to structured approaches, agents first publish terms within their stored documents. However, frequent terms are quickly identified and not exhaustively stored, resulting in a significant reduction in the system's storage requirements. During query lookup, agents use unstructured searches to compensate for the lack of fully published terms. Additionally, they explicitly weigh the costs involved with structured and unstructured approaches, allowing for a significant reduction in query costs. We evaluated the effectiveness of our approach using both real-world and artificial queries. We found that in most situations our approach yields near-perfect recall. We discuss the limitations of our system, as well as possible compensatory strategies.
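
The publish-then-fall-back idea described in the abstract can be illustrated with a small, single-process sketch. It is not PHIRST itself: the class name `HybridIndex`, the per-term cap, the in-memory dictionary standing in for a structured overlay, and the cost model (posting entries transferred versus peers probed) are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict


class HybridIndex:
    """Toy stand-in for a structured term index with capped posting lists.

    Hypothetical illustration only; names and cost model are assumptions.
    """

    def __init__(self, cap):
        self.cap = cap                    # assumed per-term storage limit
        self.postings = defaultdict(set)  # term -> peers published for it
        self.truncated = set()            # terms treated as "frequent"
        self.peers = {}                   # peer -> list of per-document term sets

    def publish(self, peer, doc_terms):
        """Publish a document's terms, but stop storing terms that hit the cap."""
        self.peers.setdefault(peer, []).append(set(doc_terms))
        for term in set(doc_terms):
            if term in self.truncated or peer in self.postings[term]:
                continue
            if len(self.postings[term]) >= self.cap:
                self.truncated.add(term)  # frequent term: not exhaustively stored
            else:
                self.postings[term].add(peer)

    def _flood(self, terms):
        """Unstructured search: probe every peer (a stand-in for flooding)."""
        return {p for p, docs in self.peers.items()
                if any(terms <= d for d in docs)}

    def search(self, query_terms, flood_cost=None):
        """Return (mode, peers holding a document with all query terms)."""
        terms = set(query_terms)
        if flood_cost is None:
            flood_cost = len(self.peers)  # assumed cost: one probe per peer
        complete = [t for t in terms if t not in self.truncated]
        # Assumed structured cost: posting entries that must be fetched.
        structured_cost = sum(len(self.postings[t]) for t in complete)
        if not complete or structured_cost > flood_cost:
            return "unstructured", self._flood(terms)
        # Intersect the fully published lists, then let candidates check
        # the remaining (frequent) terms against their local documents.
        candidates = set.intersection(*(self.postings[t] for t in complete))
        hits = {p for p in candidates
                if any(terms <= d for d in self.peers[p])}
        return "structured", hits
```

In this sketch a query made up only of frequent terms always falls back to the unstructured path, while a query containing at least one fully published term is answered from the index whenever the assumed structured cost does not exceed the assumed flooding cost.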