Web search engines achieve efficient query processing by partitioning and replicating the index data structure that supports it. Current practice simply partitions and replicates the text collection across the cluster processors and then builds an index data structure on each processor. This paper proposes a different approach: constructing an index data structure that explicitly accounts for the fact that the data is partitioned and replicated. This leads to a so-called 3D indexing strategy that outperforms current approaches. Performance is further boosted by an application caching scheme devised to hold the most frequently issued queries.
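
As a rough illustration of the baseline the abstract calls current practice, the sketch below partitions a toy collection across processors, replicates each partition's local inverted index, and places a small frequency-based query cache in front. It does not implement the paper's 3D indexing strategy, whose details the abstract does not give. All names (`Cluster`, `build_local_index`, `search`), the round-robin partitioning, the conjunctive matching, and the cache admission rule are assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_local_index(docs):
    """Build an inverted index (term -> set of doc ids) for one partition."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


class Cluster:
    """Hypothetical partitions x replicas grid of local inverted indexes."""

    def __init__(self, docs, partitions, replicas, cache_size):
        # Assumption: documents are assigned round-robin by sorted id.
        parts = [dict() for _ in range(partitions)]
        for i, doc_id in enumerate(sorted(docs)):
            parts[i % partitions][doc_id] = docs[doc_id]
        # Each row is one partition; each column is one replica of its index.
        self.grid = [[build_local_index(p) for _ in range(replicas)]
                     for p in parts]
        self.replicas = replicas
        self.next_replica = 0
        self.cache_size = cache_size
        self.cache = {}        # query -> cached result list
        self.freq = Counter()  # how often each query has been issued

    def search(self, query):
        """Return sorted ids of documents containing every query term."""
        self.freq[query] += 1
        if query in self.cache:
            return self.cache[query]
        # Pick replicas round-robin so load spreads across columns.
        r = self.next_replica
        self.next_replica = (r + 1) % self.replicas
        terms = query.lower().split()
        hits = set()
        for row in self.grid:  # every partition must be consulted
            index = row[r]
            postings = [index.get(t, set()) for t in terms]
            if postings:
                hits |= set.intersection(*postings)
        result = sorted(hits)
        self._admit(query, result)
        return result

    def _admit(self, query, result):
        """Keep the most frequently issued queries in the cache."""
        if self.cache_size <= 0:
            return
        if len(self.cache) < self.cache_size:
            self.cache[query] = result
            return
        victim = min(self.cache, key=lambda q: self.freq[q])
        if self.freq[query] > self.freq[victim]:
            del self.cache[victim]
            self.cache[query] = result
```

In this sketch a query touches one replica of every partition, so adding partitions shortens each local posting list while adding replicas raises the number of queries that can be served concurrently. The cache only replaces an entry when the incoming query has been issued more often than the least frequent cached one, which is one simple way to favor frequently issued queries.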