Query forwarding in geographically distributed search engines

Authors:
B. Barla Cambazoglu;Emre Varol;Enver Kayaaslan;Cevdet Aykanat;Ricardo Baeza-Yates
Affiliations:
Yahoo!, Barcelona, Spain;Bilkent University, Ankara, Turkey;Bilkent University, Ankara, Turkey;Bilkent University, Ankara, Turkey;Yahoo!, Barcelona, Spain
Venue:
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Year:
2010


Abstract

Query forwarding is an important technique for preserving result quality in distributed search engines whose index is geographically partitioned over multiple search sites. The key component of query forwarding is the thresholding algorithm that makes the forwarding decisions. In this paper, we propose a linear-programming-based thresholding algorithm that significantly outperforms the current state of the art in terms of search efficiency. Moreover, we evaluate a greedy heuristic for partial index replication and investigate the impact of result cache freshness on query forwarding performance. Finally, we present some optimizations that further improve performance under certain conditions. We evaluate the proposed techniques through simulations in a real-life setting, using a large query log and a document collection obtained from Yahoo!.
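
The abstract does not spell out how a forwarding decision is made. As a rough illustration only, the sketch below shows a generic threshold test: the local site forwards a query to a remote site when that site's score upper bound exceeds the score of the local k-th result. This is not the linear-programming-based algorithm proposed in the paper. The function name sites_to_forward, the site labels, and the assumption that per-site upper bounds are already available are all hypothetical.

```python
from typing import Dict, List


def sites_to_forward(local_topk_scores: List[float], k: int,
                     remote_upper_bounds: Dict[str, float]) -> List[str]:
    """Return remote sites that could still contribute a document to the top-k.

    remote_upper_bounds maps a (hypothetical) site name to an assumed upper
    bound on the score any document at that site can reach for the query.
    """
    if len(local_topk_scores) < k:
        # Fewer than k local results: every remote site may improve the answer.
        return sorted(remote_upper_bounds)
    # Score of the k-th best local result acts as the forwarding threshold.
    kth_score = sorted(local_topk_scores, reverse=True)[k - 1]
    # Forward only to sites whose best possible score beats the threshold.
    return sorted(site for site, bound in remote_upper_bounds.items()
                  if bound > kth_score)


if __name__ == "__main__":
    bounds = {"eu": 6.5, "asia": 5.0, "us": 8.0}
    print(sites_to_forward([9.0, 7.5, 6.0], 3, bounds))
```

In this sketch a tighter upper bound means fewer forwarded queries, which is why the quality of the thresholds matters for search efficiency.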