In this invited talk we address the algorithmic problems behind a truly distributed Web search engine. The main goal is to reduce the cost of a Web search engine while keeping all the benefits of a centralized search engine, in spite of the intrinsic network latency imposed by the Internet. The key ideas for achieving this goal are layered caching, online prediction mechanisms, and exploiting the locality and distribution of queries.