ROAR: increasing the flexibility and performance of distributed search

Authors:
Costin Raiciu;Felipe Huici;Mark Handley;David S. Rosenblum
Affiliations:
University College London, London, United Kingdom;NEC Europe Ltd., Heidelberg, Germany;University College London, London, United Kingdom;University College London, London, United Kingdom
Venue:
Proceedings of the ACM SIGCOMM 2009 conference on Data communication
Year:
2009

Abstract

To search the web quickly, search engines partition the web index over many machines, and consult every partition when answering a query. To increase throughput, replicas are added for each of these machines. The key parameter of these algorithms is the trade-off between replication and partitioning: increasing the partitioning level improves query completion time since more servers handle the query, but may incur non-negligible startup costs for each sub-query. Finding the right operating point and adapting to it can significantly improve performance and reduce costs. We introduce Rendezvous On a Ring (ROAR), a novel distributed algorithm that enables on-the-fly re-configuration of the partitioning level. ROAR can add and remove servers without stopping the system, cope with server failures, and provide good load-balancing even with a heterogeneous server pool. We demonstrate these claims using a privacy-preserving search application built upon ROAR.
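
The trade-off between partitioning and replication described in the abstract can be illustrated with a toy cost model. This is a minimal sketch under stated assumptions, not the ROAR algorithm or the authors' evaluation model: it assumes n identical servers, an index split into p equal partitions (so each partition has n/p replicas), a fixed per-sub-query startup cost, and matching work that divides evenly across partitions. The function names and the numbers used are hypothetical.

```python
def replicas(n, p):
    """Replicas per partition when n servers hold p partitions (assumes p divides n)."""
    return n // p


def query_latency(p, total_work, startup):
    """Completion time of one query: each of the p sub-queries pays a fixed
    startup cost and then scans 1/p of the index (assumed cost model)."""
    return startup + total_work / p


def throughput(n, p, total_work, startup):
    """Queries per unit time: every query occupies p servers for
    query_latency(...) time units, so n servers sustain n / (p * latency)."""
    return n / (p * query_latency(p, total_work, startup))


if __name__ == "__main__":
    n = 1000            # servers (hypothetical)
    total_work = 1.0    # time for one server to scan the whole index (hypothetical)
    startup = 0.005     # fixed cost per sub-query (hypothetical)
    for p in (10, 50, 100, 500):
        print(
            f"p={p:4d} replicas={replicas(n, p):4d} "
            f"latency={query_latency(p, total_work, startup):.4f} "
            f"throughput={throughput(n, p, total_work, startup):.1f}"
        )
```

In this model, raising p shortens each query because more servers share the work, but the startup cost is paid p times per query, so aggregate throughput falls. That is the sense in which an operating point has to be chosen and, as load changes, adjusted.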