MixApart: decoupled analytics for shared storage systems

  • Authors:
  • Madalin Mihailescu;Gokul Soundararajan;Cristiana Amza

  • Affiliations:
  • University of Toronto;NetApp;University of Toronto

  • Venue:
  • HotStorage'12 Proceedings of the 4th USENIX conference on Hot Topics in Storage and File Systems
  • Year:
  • 2012

Abstract

Data analytics and enterprise applications have very different storage functionality requirements. For this reason, enterprise deployments of data analytics typically run on a separate storage silo. This separation can generate additional costs and inefficiencies in data management, e.g., whenever data needs to be archived, copied, or migrated across silos. We introduce MixApart, a scalable data processing framework for shared enterprise storage systems. With MixApart, a single consolidated storage back-end manages enterprise data and services all types of workloads, thereby lowering hardware costs and simplifying data management. In addition, MixApart delivers the local storage performance required by analytics through an integrated data caching and scheduling solution. Our preliminary evaluation shows that MixApart can be 45% faster than the traditional ingest-then-compute workflow used in enterprise IT analytics, while requiring one third of the storage capacity when compared to HDFS.