Current search mechanisms in DHT-based P2P systems handle single-keyword search well. Multi-keyword search, however, is also common and useful in many real applications. Simply reusing the single-keyword solution requires distributed intersection/union operations across wide area networks, which leads to unacceptable traffic cost. Because Bloom Filters (BFs) are well known to be effective in reducing traffic, we use BF encoding to handle multi-keyword search. Applying a BF is not difficult, but obtaining optimal results is not trivial. In this study we show, through mathematical proof, that the optimal BF setting in terms of traffic cost is determined by the global statistical information of keywords, not by the minimized false positive rate as claimed by previous methods. Through extensive experiments, we demonstrate how to obtain optimal settings. We further argue that the order in which sets are intersected is important for multi-keyword search, and we therefore design optimal order strategies based on BFs for both "and" and "or" queries. To evaluate the performance of this design, we conduct extensive simulations on the TREC WT10G test collection and the query log of a commercial search engine. Results show that our design significantly reduces search traffic, by 73% compared with the existing approach.
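To make the traffic argument concrete, the following is a minimal sketch of a two-keyword "and" query that ships a Bloom filter instead of a full posting list. It is an illustration under stated assumptions, not the paper's implementation: the names `BloomFilter` and `bf_and_query`, the SHA-256 double-hashing scheme, the 64-bit document identifier size, and the simple traffic accounting (filter bits plus returned candidate identifiers) are all hypothetical choices made for this example.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter; hashing scheme is an assumption for this sketch."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive all bit positions from one SHA-256 digest.
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the step odd
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))


def bf_and_query(postings_a, postings_b, num_bits, num_hashes, id_bits=64):
    """Intersect two posting lists held on different (simulated) nodes.

    Node A sends a Bloom filter of its list to node B. Node B returns only
    the documents that pass the filter. Node A then discards false positives.
    Returns the exact intersection and the estimated traffic in bits.
    """
    bf = BloomFilter(num_bits, num_hashes)
    for doc in postings_a:
        bf.add(doc)

    # Node B: keep documents that may be in A's list (false positives possible).
    candidates = [doc for doc in postings_b if doc in bf]

    # Node A: remove false positives by checking against its own list.
    result = set(postings_a).intersection(candidates)

    # Assumed cost model: the filter itself plus the candidate identifiers.
    traffic_bits = num_bits + len(candidates) * id_bits
    return result, traffic_bits


if __name__ == "__main__":
    list_a = list(range(0, 1000))      # documents containing keyword 1
    list_b = list(range(900, 3000))    # documents containing keyword 2
    for m in (2000, 8000, 32000):
        docs, cost = bf_and_query(list_a, list_b, num_bits=m, num_hashes=4)
        print(m, len(docs), cost)
```

Because a Bloom filter has no false negatives, the returned intersection is exact regardless of the filter size; only the traffic changes. Sweeping `num_bits` in the loop shows the trade-off the abstract refers to: a larger filter lowers the false positive rate but costs more bits to send, so, under this simplified cost model, the cheapest setting depends on the sizes of the two lists and their overlap rather than on the false positive rate alone.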