Census: location-aware membership management for large-scale distributed systems

  • Authors:
  • James Cowling;Dan R. K. Ports;Barbara Liskov;Raluca Ada Popa;Abhijeet Gaikwad

  • Affiliations:
  • MIT CSAIL;MIT CSAIL;MIT CSAIL;MIT CSAIL;École Centrale Paris

  • Venue:
  • USENIX'09 Proceedings of the 2009 conference on USENIX Annual technical conference
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present Census, a platform for building large-scale distributed applications. Census provides a membership service and a multicast mechanism. The membership service provides every node with a consistent view of the system membership, which may be global or partitioned into location-based regions. Census distributes membership updates with low overhead, propagates changes promptly, and is resilient to both crashes and Byzantine failures. We believe that Census is the first system to provide a consistent membership abstraction at very large scale, greatly simplifying the design of applications built atop large deployments such as multi-site data centers. Census builds on a novel multicast mechanism that is closely integrated with the membership service. It organizes nodes into a reliable overlay composed of multiple distribution trees, using network coordinates to minimize latency. Unlike other multicast systems, it avoids the cost of using distributed algorithms to construct and maintain trees. Instead, each node independently produces the same trees from the consistent membership view. Census uses this multicast mechanism to distribute membership updates, along with application-provided messages. We evaluate the platform under simulation and on a real-world deployment on PlanetLab. We find that it imposes minimal bandwidth overhead, is able to react quickly to node failures and changes in the system membership, and can scale to substantial size.