Data deduplication has been an effective way to eliminate redundant data, mainly in backup storage systems. Since primary storage systems in recent cloud services are also expected to contain substantial redundancy, deduplication can bring significant cost savings to primary storage as well. However, primary storage imposes a high performance requirement of several GB/s, whereas most conventional deduplication techniques target 200-300 MB/s. To build a high-performance deduplication system for primary storage, we thoroughly analyze the performance bottlenecks of previous deduplication systems. With recent improvements in flash devices such as SSDs, the bottleneck of deduplication in primary storage lies not only in key-value store lookup but also in the computation for data segmentation and fingerprinting. To overcome these bottlenecks, we propose a new deduplication system that utilizes a GPGPU. Our proposed system, termed GHOST, offloads and optimizes deduplication processing on the GPGPU with three techniques: (1) an in-host data cache, (2) destage-aware data offloading to the GPGPU, and (3) an in-GPGPU table cache for the key-value store. These techniques improve the offloaded deduplication processing by about 10-20% on a realistic primary-storage workload compared to a naive approach. Our deduplication system achieves up to 1.5 GB/s, about 5 times the throughput of CPU-only deduplication systems.
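To make the pipeline stages concrete, the following is a minimal CPU-only sketch of the three steps the abstract identifies as the workload: content-defined data segmentation (here with a simple cumulative byte hash, not the paper's actual chunking function), SHA-256 fingerprinting, and a key-value index lookup. All function names, hash parameters, and chunk-size limits are illustrative assumptions, not the GHOST implementation; GHOST offloads the segmentation and fingerprinting stages to a GPGPU.

```python
import hashlib

def chunk_boundaries(data, mask=(1 << 12) - 1,
                     min_size=2048, max_size=16384):
    """Content-defined chunking sketch: declare a chunk boundary when a
    simple rolling byte hash matches `mask` (after min_size bytes), or
    when max_size is reached. Parameters are illustrative, not GHOST's."""
    boundaries = []
    start, h = 0, 0
    for i, b in enumerate(data):
        h = (h * 31 + b) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= min_size and (h & mask) == mask) or size >= max_size:
            boundaries.append((start, i + 1))
            start, h = i + 1, 0
    if start < len(data):
        boundaries.append((start, len(data)))
    return boundaries

def deduplicate(data, index=None):
    """Fingerprint each chunk with SHA-256 and look it up in a key-value
    index (a plain dict standing in for the key-value store). Returns
    (recipe, index, bytes_written): `recipe` is the list of fingerprints
    needed to reconstruct `data`; only chunks absent from the index count
    toward bytes_written."""
    index = {} if index is None else index
    recipe, written = [], 0
    for s, e in chunk_boundaries(data):
        fp = hashlib.sha256(data[s:e]).hexdigest()
        if fp not in index:
            index[fp] = data[s:e]   # store only previously unseen chunks
            written += e - s
        recipe.append(fp)
    return recipe, index, written
```

Writing the same data a second time against a warm index stores zero new bytes, and concatenating the indexed chunks in recipe order reconstructs the original data; in a real system each of these stages (boundary scan, hashing, index probe) is a candidate for parallel offload.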