Live deduplication storage of virtual machine images in an open-source cloud

  • Authors:
  • Chun-Ho Ng;Mingcao Ma;Tsz-Yeung Wong;Patrick P. C. Lee;John C. S. Lui

  • Affiliations:
  • Dept. of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong;Dept. of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong;Dept. of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong;Dept. of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong;Dept. of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong

  • Venue:
  • Middleware '11: Proceedings of the 12th ACM/IFIP/USENIX International Conference on Middleware
  • Year:
  • 2011

Abstract

Deduplication avoids storing multiple copies of data blocks with identical content, and has been shown to effectively reduce the disk space needed to store multi-gigabyte virtual machine (VM) images. However, it remains challenging to deploy deduplication in a real system, such as a cloud platform, where VM images are regularly inserted and retrieved. We propose LiveDFS, a live deduplication file system that enables deduplicated storage of VM images in an open-source cloud deployed on low-cost commodity hardware with a limited memory footprint. Its distinguishing features include spatial locality, prefetching of metadata, and journaling. LiveDFS is POSIX-compliant and is implemented as a Linux kernel-space file system. We deploy our LiveDFS prototype as a storage layer in a cloud platform based on OpenStack and conduct extensive experiments. Compared to an ordinary file system without deduplication, LiveDFS saves at least 40% of the space for storing VM images while achieving reasonable performance in importing and retrieving them. Our work demonstrates the feasibility of deploying LiveDFS in an open-source cloud.
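
To make the core idea concrete, the following is a minimal user-space sketch of block-level deduplication. It is not LiveDFS's kernel implementation. The class name `DedupStore`, the 4096-byte block size, and the use of SHA-256 as the block fingerprint are all assumptions made for illustration. Each image is split into fixed-size blocks, every distinct block is stored once under its fingerprint, and an image is recorded as the ordered list of its block fingerprints.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size in bytes (illustrative, not from the paper)


class DedupStore:
    """Toy in-memory store that keeps one copy of each distinct block."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = {}  # fingerprint -> block bytes, stored once
        self.files = {}   # file name -> ordered list of fingerprints

    def put(self, name, data):
        """Split data into fixed-size blocks and store only unseen blocks."""
        recipe = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            # SHA-256 is an assumed fingerprint function for this sketch.
            fingerprint = hashlib.sha256(block).digest()
            if fingerprint not in self.blocks:
                self.blocks[fingerprint] = block
            recipe.append(fingerprint)
        self.files[name] = recipe

    def get(self, name):
        """Rebuild a file by concatenating its blocks in order."""
        return b"".join(self.blocks[fp] for fp in self.files[name])

    def stored_bytes(self):
        """Physical bytes held after deduplication."""
        return sum(len(block) for block in self.blocks.values())


# Two hypothetical "images" that share their first two blocks.
image_a = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"C" * BLOCK_SIZE
image_b = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"D" * BLOCK_SIZE

store = DedupStore()
store.put("image_a", image_a)
store.put("image_b", image_b)

logical = len(image_a) + len(image_b)  # bytes without deduplication
physical = store.stored_bytes()        # bytes actually kept
```

In this example the two images total six logical blocks but only four distinct blocks are kept, so `physical` is two thirds of `logical`. A real file system must also keep the fingerprint index within limited memory and persist it safely on disk, which this sketch does not attempt.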