Designing a DHT for low latency and high throughput

Authors:
Frank Dabek; Jinyang Li; Emil Sit; James Robertson; M. Frans Kaashoek; Robert Morris
Affiliations:
MIT Computer Science and Artificial Intelligence Laboratory (all authors)
Venue:
NSDI'04 Proceedings of the 1st conference on Symposium on Networked Systems Design and Implementation - Volume 1
Year:
2004

Abstract

Designing a wide-area distributed hash table (DHT) that provides high-throughput and low-latency network storage is a challenge. Existing systems have explored a range of solutions, including iterative routing, recursive routing, proximity routing and neighbor selection, erasure coding, replication, and server selection. This paper explores the design of these techniques and their interaction in a complete system, drawing on the measured performance of a new DHT implementation and results from a simulator with an accurate Internet latency model. New techniques that resulted from this exploration include use of latency predictions based on synthetic co-ordinates, efficient integration of lookup routing and data fetching, and a congestion control mechanism suitable for fetching data striped over large numbers of servers. Measurements with 425 server instances running on 150 PlanetLab and RON hosts show that the latency optimizations reduce the time required to locate and fetch data by a factor of two. The throughput optimizations result in a sustainable bulk read throughput related to the number of DHT hosts times the capacity of the slowest access link; with 150 selected PlanetLab hosts, the peak aggregate throughput over multiple clients is 12.8 megabytes per second.
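
One technique the abstract names, latency prediction from synthetic coordinates, can be illustrated with a minimal sketch of server selection. It assumes each node holds a two-dimensional coordinate and that predicted round-trip time is the Euclidean distance between coordinates. The coordinate values, the server names, and the functions `predicted_rtt` and `pick_closest` are hypothetical and chosen for illustration; this is not the paper's implementation.

```python
import math


def predicted_rtt(a, b):
    """Predict latency as the Euclidean distance between two synthetic coordinates."""
    return math.dist(a, b)


def pick_closest(client, servers, k):
    """Return the names of the k servers with the lowest predicted latency."""
    ranked = sorted(servers, key=lambda name: predicted_rtt(client, servers[name]))
    return ranked[:k]


# Hypothetical coordinates; in a real system they would be learned from measurements.
client = (0.0, 0.0)
servers = {
    "s1": (30.0, 40.0),
    "s2": (3.0, 4.0),
    "s3": (6.0, 8.0),
    "s4": (60.0, 80.0),
}

# Choose the two servers predicted to be nearest, without probing any of them.
chosen = pick_closest(client, servers, 2)
print(chosen)
```

The point of the sketch is that a client holding coordinates for candidate servers can rank them locally, so no extra round trips are spent measuring latency before a fetch.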