Designing a DHT for low latency and high throughput

  • Authors:
  • Frank Dabek;Jinyang Li;Emil Sit;James Robertson;M. Frans Kaashoek;Robert Morris

  • Affiliations:
  • MIT Computer Science and Artificial Intelligence Laboratory;MIT Computer Science and Artificial Intelligence Laboratory;MIT Computer Science and Artificial Intelligence Laboratory;MIT Computer Science and Artificial Intelligence Laboratory;MIT Computer Science and Artificial Intelligence Laboratory;MIT Computer Science and Artificial Intelligence Laboratory

  • Venue:
  • NSDI'04 Proceedings of the 1st conference on Symposium on Networked Systems Design and Implementation - Volume 1
  • Year:
  • 2004

Quantified Score

Hi-index 0.06

Visualization

Abstract

Designing a wide-area distributed hash table (DHT) that provides high-throughput and low-latency network storage is a challenge. Existing systems have explored a range of solutions, including iterative routing, recursive routing, proximity routing and neighbor selection, erasure coding, replication, and server selection. This paper explores the design of these techniques and their interaction in a complete system, drawing on the measured performance of a new DHT implementation and results from a simulator with an accurate Internet latency model. New techniques that resulted from this exploration include use of latency predictions based on synthetic co-ordinates, efficient integration of lookup routing and data fetching, and a congestion control mechanism suitable for fetching data striped over large numbers of servers. Measurements with 425 server instances running on 150 PlanetLab and RON hosts show that the latency optimizations reduce the time required to locate and fetch data by a factor of two. The throughput optimizations result in a sustainable bulk read throughput related to the number of DHT hosts times the capacity of the slowest access link; with 150 selected PlanetLab hosts, the peak aggregate throughput over multiple clients is 12.8 megabytes per second.