Structuring PLFS for extensibility

Authors:
Chuck Cranor; Milo Polte; Garth Gibson
Affiliations:
Carnegie Mellon University, Pittsburgh, PA; WibiData, Inc., San Francisco, CA; Carnegie Mellon University, Pittsburgh, PA
Venue:
PDSW '13 Proceedings of the 8th Parallel Data Storage Workshop
Year:
2013

References:

GPFS: A Shared-Disk File System for Large Computing Clusters
FAST '02 Proceedings of the Conference on File and Storage Technologies

An Abstract-Device Interface for Implementing Portable Parallel-I/O Interfaces
FRONTIERS '96 Proceedings of the 6th Symposium on the Frontiers of Massively Parallel Computation

The Google file system
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles

The Panasas ActiveScale Storage Cluster: Delivering Scalable High Bandwidth Storage
Proceedings of the 2004 ACM/IEEE conference on Supercomputing

Exploiting Lustre File Joining for Effective Collective IO
CCGRID '07 Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid

Bigtable: a distributed storage system for structured data
OSDI '06 Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation - Volume 7

Scalable performance of the Panasas parallel file system
FAST '08 Proceedings of the 6th USENIX Conference on File and Storage Technologies

PLFS: a checkpoint filesystem for parallel applications
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis

I/O performance challenges at leadership scale
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis

...and eat it too: high read performance in write-optimized HPC I/O middleware file formats
Proceedings of the 4th Annual Workshop on Petascale Data Storage

Cloud analytics: do we really need to reinvent the storage stack?
HotCloud '09 Proceedings of the 2009 conference on Hot topics in cloud computing

Optimization Techniques at the I/O Forwarding Layer
CLUSTER '10 Proceedings of the 2010 IEEE International Conference on Cluster Computing

The Hadoop Distributed File System
MSST '10 Proceedings of the 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST)

Integrated in-system storage architecture for high performance computing
Proceedings of the 2nd International Workshop on Runtime and Operating Systems for Supercomputers

LDPLFS: Improving I/O Performance without Application Modification
IPDPSW '12 Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum

The Power and Challenges of Transformative I/O
CLUSTER '12 Proceedings of the 2012 IEEE International Conference on Cluster Computing

I/O acceleration with pattern detection
Proceedings of the 22nd international symposium on High-performance parallel and distributed computing

Discovering Structure in Unstructured I/O
SCC '12 Proceedings of the 2012 SC Companion: High Performance Computing, Networking Storage and Analysis

Abstract

The Parallel Log Structured Filesystem (PLFS) [5] was designed to transparently transform highly concurrent, massive high-performance computing (HPC) N-to-1 (N-1) checkpoint workloads into N-to-N workloads, avoiding the single-file performance bottlenecks of typical HPC distributed filesystems. PLFS has produced speedups of 2-150X for N-1 workloads at Los Alamos National Lab. Having successfully improved N-1 performance, we have restructured PLFS for extensibility so that it can be applied to more workloads and storage systems. In this paper we describe the evolution of PLFS from a single-purpose log-structured middleware filesystem into a more general platform for transparently translating application I/O patterns. As an example of this extensibility, we show how PLFS can now be used to enable HPC applications to perform N-1 checkpoints on an HDFS-based cloud storage system.
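
The N-to-1 to N-to-N transformation described in the abstract can be illustrated with a minimal in-memory sketch. This is not PLFS's actual on-disk format or API: the names `Container`, `WriterLog`, and `IndexEntry` are hypothetical, and a simple sequence counter is assumed for ordering writes. Each writer appends to its own private log and records where those bytes belong in the shared logical file; a read merges the per-writer indexes to rebuild the logical contents.

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    logical_offset: int  # where the bytes belong in the shared logical file
    length: int          # number of bytes written
    log_offset: int      # where the bytes sit in the writer's private log
    seq: int             # assumed global write order, used to resolve overlaps


@dataclass
class WriterLog:
    data: bytearray = field(default_factory=bytearray)
    index: list = field(default_factory=list)


class Container:
    """Hypothetical stand-in for one logical N-1 file stored as N private logs."""

    def __init__(self):
        self.logs = {}  # writer id -> WriterLog
        self.seq = 0

    def write(self, writer, logical_offset, payload):
        # Every write is an append to the writer's own log, so writers
        # never contend for the same underlying file.
        log = self.logs.setdefault(writer, WriterLog())
        log.index.append(
            IndexEntry(logical_offset, len(payload), len(log.data), self.seq)
        )
        log.data += payload
        self.seq += 1

    def read_all(self):
        # Merge all per-writer indexes and replay them in write order.
        entries = [(e, log) for log in self.logs.values() for e in log.index]
        size = max((e.logical_offset + e.length for e, _ in entries), default=0)
        out = bytearray(size)  # unwritten holes read back as zero bytes
        for e, log in sorted(entries, key=lambda pair: pair[0].seq):
            chunk = log.data[e.log_offset:e.log_offset + e.length]
            out[e.logical_offset:e.logical_offset + e.length] = chunk
        return bytes(out)


# Two writers interleave strided writes into one logical checkpoint file.
c = Container()
c.write(0, 0, b"AAAA")
c.write(1, 4, b"BBBB")
c.write(0, 8, b"CCCC")
print(c.read_all())  # b'AAAABBBBCCCC'
```

In this sketch writer 0's private log holds `AAAACCCC` contiguously even though those bytes are strided in the logical file, which is the property that turns a contended shared-file workload into independent sequential appends.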