Automatic management of partitioned, replicated search services

Authors:
Florian Leibert; Jake Mannix; Jimmy Lin; Babak Hamadani
Affiliations:
Twitter, San Francisco, California; Twitter, San Francisco, California; Twitter, San Francisco, California; Twitter, San Francisco, California
Venue:
Proceedings of the 2nd ACM Symposium on Cloud Computing
Year:
2011

Abstract

Low-latency, high-throughput web services are typically achieved through partitioning, replication, and caching. Although these strategies and the general design of large-scale distributed search systems are well known, the academic literature provides surprisingly few details on deployment and operational considerations in production environments. In this paper, we address this gap by sharing the distributed search architecture that underlies Twitter user search, a service for discovering relevant accounts on the popular microblogging service. Our design makes use of the principle that eliminates the distinction between failure and other anticipated service disruptions: as a result, most operational scenarios share exactly the same code path. This simplicity leads to greater robustness and fault-tolerance. Another salient feature of our architecture is its exclusive reliance on open-source software components, which makes it easier for the community to learn from our experiences and replicate our findings.