Storage provisioning and allocation in a large cloud environment

  • Authors:
  • Murray Stokely; Arif Merchant

  • Affiliations:
  • Google, Inc., Mountain View, USA; Google, Inc., Mountain View, USA

  • Venue:
  • Proceedings of the 2012 workshop on Management of big data systems
  • Year:
  • 2012

Abstract

Provisioning scarce resources among competing users and jobs remains one of the primary challenges of operating large-scale, distributed computing environments. Distributed storage systems, in particular, typically rely on hard, operator-set quotas to control disk allocation and to enforce isolation of space and I/O bandwidth among disparate users. In [7], we set up an experimental market-based system that ran auctions and sent clear price signals to storage users, encouraging them to balance storage usage against other resource dimensions. The final resource allocations illustrated how a market mechanism can lead to significant, beneficial changes in user behavior. This mechanism improved resource allocations for current user demands, but it still suffered from the fact that users and operators are very poor at predicting future requirements and, as a result, tend to over-provision grossly. In [5], we attempted to address this by collecting detailed usage information over multiple years and using ensemble forecasting methods to predict future resource needs. Specifically, we measured the disk space usage, I/O rate, and age of stored data for thousands of engineering users and teams in a large private cloud spanning dozens of clusters on multiple continents. We found that although individual time series often have unstable usage trends, regional aggregation, user classification, and ensemble forecasting can be combined to predict future use more accurately for the majority of users. Both approaches demonstrated the potential to improve the accuracy of our demand forecasts and capacity planning process, but significant operational challenges remain.
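
As a rough illustration of the forecasting idea described in the abstract, the sketch below aggregates per-user usage series and combines several simple forecasters by taking their median. The abstract does not say which models or which combination rule were used, so the three forecasters, the median combiner, and every function name here are assumptions chosen for illustration only.

```python
from statistics import mean, median

# Hypothetical member models; the abstract does not name the ones actually used.

def naive_forecast(series, horizon):
    """Carry the last observed value forward unchanged."""
    return float(series[-1])

def drift_forecast(series, horizon):
    """Extend the average step between the first and last observation."""
    slope = (series[-1] - series[0]) / (len(series) - 1)
    return series[-1] + slope * horizon

def linear_trend_forecast(series, horizon):
    """Extrapolate an ordinary least-squares line fitted to the history."""
    n = len(series)
    x_mean = (n - 1) / 2
    y_mean = mean(series)
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(series))
    slope = sxy / sxx
    return y_mean + slope * (n - 1 + horizon - x_mean)

DEFAULT_MODELS = (naive_forecast, drift_forecast, linear_trend_forecast)

def ensemble_forecast(series, horizon, models=DEFAULT_MODELS):
    """Combine the member forecasts with a median (an assumed combiner).

    Requires at least two observations, because the drift and trend
    models need a slope.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    return median(model(series, horizon) for model in models)

def aggregate_usage(series_by_user):
    """Sum equally long per-user series into one aggregate series."""
    return [sum(values) for values in zip(*series_by_user.values())]

if __name__ == "__main__":
    # Made-up disk usage (TB) per period for two hypothetical teams.
    usage = {"team_a": [10, 12, 14, 16, 18], "team_b": [5, 9, 6, 11, 7]}
    regional = aggregate_usage(usage)
    print("aggregate history:", regional)
    print("forecast 2 periods ahead:", ensemble_forecast(regional, 2))
```

The median is used here because it limits the influence of any single member model that extrapolates badly on an unstable series; a weighted mean or a trimmed mean would be equally plausible choices.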