DieCast: testing distributed systems with an accurate scale model

  • Authors:
  • Diwaker Gupta;Kashi V. Vishwanath;Amin Vahdat

  • Affiliations:
  • University of California, San Diego;University of California, San Diego;University of California, San Diego

  • Venue:
  • NSDI'08 Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Large-scale network services can consist of tens of thousands of machines running thousands of unique software configurations spread across hundreds of physical networks. Testing such services for complex performance problems and configuration errors remains a difficult problem. Existing testing techniques, such as simulation or running smaller instances of a service, have limitations in predicting overall service behavior. Although technically and economically infeasible at this time, testing should ideally be performed at the same scale and with the same configuration as the deployed service. We present DieCast, an approach to scaling network services in which we multiplex all of the nodes in a given service configuration as virtual machines (VM) spread across a much smaller number of physical machines in a test harness. CPU, network, and disk are then accurately scaled to provide the illusion that each VM matches a machine from the original service in terms of both available computing resources and communication behavior to remote service nodes. We present the architecture and evaluation of a system to support such experimentation and discuss its limitations. We show that for a variety of services--including a commercial, high-performance, cluster-based file system--and resource utilization levels, DieCast matches the behavior of the original service while using a fraction of the physical resources.