We study the problem of evaluating ranked (top-k) queries on textual collections ranging from multiple gigabytes to terabytes in size. We focus on the case of a global index organization in a highly distributed environment, and consider a class of ranking functions that includes common variants of the Cosine and Okapi measures. The main bottleneck in such a scenario is the amount of communication required during query evaluation. We propose several efficient query evaluation schemes and evaluate their performance. Our results on real search engine query traces and over 120 million web pages show that, after careful optimization, such queries can be evaluated at a reasonable cost, while challenges remain for even larger collections and more general classes of ranking functions.