Performing information retrieval (IR) efficiently in a distributed environment is currently one of the main challenges in IR. In such a setting, document representations are distributed among nodes so that a query processing algorithm can efficiently direct each query to the nodes that contribute to its result. Existing term-based document distribution algorithms do not scale to large collections or many-term queries because they incur heavy network traffic during both the distribution and query phases. We propose a novel document distribution algorithm, distance-based document distribution. The distribution it produces allows any IR query to be answered effectively by contacting only a few nodes, independent of both document collection size and network size, which improves efficiency. We accomplish this by linearizing the IR search space so that it reflects the ranking formula later used for retrieval. Our experimental evaluation indicates that effective information retrieval can be accomplished efficiently in distributed networks.
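The abstract does not spell out the linearization, so the following is only a minimal illustrative sketch of the general idea, not the paper's algorithm. It assumes a vector space model with non-negative term weights, cosine similarity as the ranking formula, a fixed reference vector, and equal-width key ranges per node. All of these choices, and every function name below, are hypothetical. Each document is mapped to a one-dimensional key (its normalized angular distance to the reference vector), and a query contacts only the node owning its own key plus a fixed number of neighbours.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def linear_key(vec, reference):
    """Map a vector to a key in [0, 1]: its angle to the reference vector.

    Assumes non-negative weights, so the angle lies in [0, pi/2].
    """
    c = max(0.0, min(1.0, cosine(vec, reference)))  # guard against rounding
    return math.acos(c) / (math.pi / 2)


def node_for_key(key, num_nodes):
    """Assign a key to one of num_nodes equal-width ranges (an assumption)."""
    return min(int(key * num_nodes), num_nodes - 1)


def distribute(docs, reference, num_nodes):
    """Place every document on the node that owns its linearized key."""
    placement = {n: [] for n in range(num_nodes)}
    for doc_id, vec in docs.items():
        node = node_for_key(linear_key(vec, reference), num_nodes)
        placement[node].append(doc_id)
    return placement


def nodes_to_contact(query, reference, num_nodes, radius=1):
    """Return the query's home node plus `radius` neighbours on each side."""
    home = node_for_key(linear_key(query, reference), num_nodes)
    return [n for n in range(home - radius, home + radius + 1)
            if 0 <= n < num_nodes]


# Hypothetical toy data, for illustration only.
reference = {"p2p": 1.0, "retrieval": 1.0, "index": 1.0}
docs = {
    "d1": {"p2p": 1.0, "retrieval": 1.0},
    "d2": {"index": 1.0},
    "d3": {"music": 1.0},
}
placement = distribute(docs, reference, num_nodes=4)
contacted = nodes_to_contact({"p2p": 1.0}, reference, num_nodes=4)
```

In this sketch the number of contacted nodes depends only on `radius`, not on the number of documents or nodes, which mirrors the property the abstract claims. A single reference vector is a coarse linearization: documents that are close to each other receive close keys, but close keys do not guarantee similar documents, so a practical scheme would need a mapping tied more tightly to the ranking formula.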