This paper presents a comparison of scheduling algorithms applied to load balancing the query traffic on distributed inverted files. We implemented a number of algorithms taken from the literature. We propose a novel method for formulating the cost of query processing so that these algorithms can be used to schedule queries onto processors. We avoid measuring load balance at the search engine side because doing so can lead to imprecise evaluation. Instead, our method is based on simulating a bulk-synchronous parallel computer at the broker machine side. This simulation determines an optimal way of processing the queries and provides a stable baseline against which both the broker and the search engine can tune their operation in accordance with the observed query traffic. We conclude that the simplest load balancing heuristics are good enough to achieve efficient performance. Our method can be used in practice by broker machines to schedule queries efficiently onto the cluster processors of search engines.
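To make the idea of a simple load balancing heuristic concrete, the following is a minimal sketch of one such heuristic: greedy assignment of each incoming query to the currently least-loaded processor. It is an illustration only, not the method evaluated in the paper. It assumes each query already has a single numeric cost estimate, whereas the paper derives costs from a simulated bulk-synchronous parallel computer at the broker. The names `schedule_least_loaded` and `imbalance`, and the max-over-mean imbalance measure, are hypothetical choices made for this example.

```python
import heapq


def schedule_least_loaded(query_costs, num_processors):
    """Greedily assign each query to the processor with the smallest load.

    query_costs: per-query cost estimates (assumed to be given; in the
    paper's setting they would come from the broker-side simulation).
    Returns (assignment, loads), where assignment[i] is the processor
    chosen for query i and loads[p] is the total cost placed on p.
    """
    # Min-heap of (accumulated load, processor id); ties go to the lower id.
    heap = [(0.0, p) for p in range(num_processors)]
    heapq.heapify(heap)

    assignment = []
    for cost in query_costs:
        load, proc = heapq.heappop(heap)
        assignment.append(proc)
        heapq.heappush(heap, (load + cost, proc))

    loads = [0.0] * num_processors
    for load, proc in heap:
        loads[proc] = load
    return assignment, loads


def imbalance(loads):
    """Ratio of maximum to mean load; 1.0 means perfectly balanced."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean if mean > 0 else 1.0


if __name__ == "__main__":
    costs = [4, 3, 2, 1]  # made-up cost estimates for four queries
    assignment, loads = schedule_least_loaded(costs, num_processors=2)
    print(assignment, loads, imbalance(loads))
```

A broker could run a routine of this kind over each batch of queries it receives and use the resulting imbalance figure as a rough indicator of how evenly the work is spread across processors.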