Reducing Query Latencies in Web Search Using Fine-Grained Parallelism

Authors:
Eitan Frachtenberg
Affiliations:
Microsoft, San Francisco, USA 94107
Venue:
World Wide Web
Year:
2009

Abstract

Semantic Web search is a new application of recent advances in information retrieval (IR), natural language processing, artificial intelligence, and other fields. The Powerset group at Microsoft develops a semantic search engine that aims to answer queries not only by matching keywords, but by matching the meaning of queries to the meaning of Web documents. Compared to typical keyword search, semantic search can pose additional engineering challenges for back-end and infrastructure design. Of these, the main challenge addressed in this paper is how to lower query latencies to acceptable, interactive levels. Index-based semantic search must process more data, such as numerous synonyms, hypernyms, multiple linguistic readings, and other semantic information, both in queries and in the index. In addition, some of the algorithms can be super-linear, such as matching co-references across a document. Consequently, many semantic queries can run significantly slower than the equivalent keyword query. Users, however, have grown to expect Web search engines to provide near-instantaneous results, and a slow search engine could be deemed unusable even if it provides highly relevant results. It is therefore imperative for any search engine to meet its users' interactivity expectations, or risk losing them.

Our approach to this challenge is to exploit data parallelism in slow search queries to reduce their latency on multi-core systems. Although all search engines are designed to exploit parallelism, at the single-node level this usually translates to throughput-oriented task parallelism, in which each core serves a different query. This paper focuses on the engineering of two latency-oriented approaches (coarse- and fine-grained) and compares them to the task-parallel approach. We use Powerset's deployed search engine to evaluate the factors that affect parallel performance: workload, overhead, load balancing, and resource contention. We also discuss heuristics that selectively control the degree of parallelism, and the consequent overhead, on a query-by-query basis. Our experimental results show that fine-grained parallelism with these dynamic heuristics can significantly reduce query latencies compared to fixed, coarse-granularity parallelization schemes. Although these results were obtained on, and optimized for, Powerset's semantic search, they can be readily generalized to a wide class of inverted-index search engines.
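The idea of splitting a single query across cores, with a per-query choice of how many cores to use, can be illustrated with a minimal sketch. This is not Powerset's implementation: the function names (`choose_degree`, `intersect_range`, `search`), the cost estimate (total posting-list length), and the threshold `min_work_per_worker` are all assumptions made for illustration. The sketch partitions the document-ID space into ranges, intersects the posting lists within each range in parallel, and falls back to serial execution for cheap queries so that they do not pay parallelization overhead.

```python
from concurrent.futures import ThreadPoolExecutor


def choose_degree(estimated_work, max_workers=8, min_work_per_worker=1000):
    """Hypothetical heuristic: give a query more workers only if it is costly."""
    return max(1, min(max_workers, estimated_work // min_work_per_worker))


def intersect_range(postings, lo, hi):
    """Intersect posting lists, restricted to document IDs in [lo, hi)."""
    sets = [{doc for doc in plist if lo <= doc < hi} for plist in postings]
    return sorted(set.intersection(*sets))


def search(postings, num_docs, max_workers=8, min_work_per_worker=1000):
    """Return (matching doc IDs, degree of parallelism used) for one query."""
    if not postings:
        return [], 1

    # Assumed cost model: work is proportional to total posting-list length.
    estimated_work = sum(len(plist) for plist in postings)
    degree = choose_degree(estimated_work, max_workers, min_work_per_worker)

    if degree == 1:
        # Cheap query: run serially and avoid thread start-up overhead.
        return intersect_range(postings, 0, num_docs), 1

    # Split the document-ID space into `degree` contiguous ranges.
    chunk = -(-num_docs // degree)  # ceiling division
    ranges = [(i * chunk, min((i + 1) * chunk, num_docs)) for i in range(degree)]

    with ThreadPoolExecutor(max_workers=degree) as pool:
        # map() yields results in range order, so the merged list stays sorted.
        parts = pool.map(lambda r: intersect_range(postings, r[0], r[1]), ranges)
        merged = [doc for part in parts for doc in part]
    return merged, degree


if __name__ == "__main__":
    even = list(range(0, 10000, 2))
    threes = list(range(0, 10000, 3))
    hits, degree = search([even, threes], num_docs=10000)
    print(len(hits), "matches using", degree, "workers")
```

Note that in CPython, threads do not run CPU-bound Python code concurrently because of the global interpreter lock, so this sketch shows the structure of the technique rather than a real speedup; a production system would use native threads or processes. Using smaller ranges than workers (more chunks than cores) would be one way to trade extra overhead for better load balancing.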