Distributed hash tables (DHTs) are efficient for queries based on key lookups. However, building the huge term indexes required for IR-style keyword search poses a scalability challenge for plain DHTs. Because document term vocabularies are large, each peer joining the network triggers a large number of key inserts and, consequently, a large number of index maintenance messages. Thus, the key to exploiting DHTs for distributed information retrieval is to reduce index maintenance costs. Various approaches in this direction have been pursued, including the use of hybrid infrastructures and changing the granularity of the inverted index to the peer level. We show that indexing costs can be reduced significantly further by letting peers form groups in a self-organized fashion. Instead of each peer submitting its index information separately, all peers of a group cooperate to publish index updates to the DHT in batches. Our evaluation shows that this approach reduces index maintenance cost by an order of magnitude while still maintaining a complete and correct term index for query processing.
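The batching idea can be illustrated with a small sketch. It is not the authors' implementation: the `ToyDHT` class, the function names, the example vocabularies, and the choice to count one message per `insert` call are all assumptions made for illustration. It also assumes a peer-level index (term to set of peers) and ignores the cost of communication inside a group.

```python
from collections import defaultdict


class ToyDHT:
    """Hypothetical in-memory stand-in for a DHT: term -> set of peer ids."""

    def __init__(self):
        self.index = defaultdict(set)
        self.messages = 0  # number of insert messages received

    def insert(self, term, peer_ids):
        # Assumption: one call models one index maintenance message.
        self.messages += 1
        self.index[term].update(peer_ids)


def publish_individually(dht, vocabularies):
    # Every peer submits each of its terms to the DHT on its own.
    for peer_id, terms in vocabularies.items():
        for term in terms:
            dht.insert(term, {peer_id})


def publish_as_group(dht, vocabularies):
    # The group first merges its members' postings locally ...
    merged = defaultdict(set)
    for peer_id, terms in vocabularies.items():
        for term in terms:
            merged[term].add(peer_id)
    # ... then publishes one batched message per distinct term.
    for term, peer_ids in merged.items():
        dht.insert(term, peer_ids)


# Made-up vocabularies for a group of three peers.
group = {
    "peer_a": {"dht", "index", "peer"},
    "peer_b": {"dht", "index", "query"},
    "peer_c": {"dht", "peer", "query"},
}

plain, batched = ToyDHT(), ToyDHT()
publish_individually(plain, group)
publish_as_group(batched, group)

print(plain.messages, batched.messages)  # insert messages in each scheme
print(plain.index == batched.index)      # both schemes build the same index
```

In this toy setting the grouped variant sends one message per distinct term in the group rather than one per term per peer, so the saving grows with the vocabulary overlap among group members, while the resulting term index is identical in both cases.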