Accelerating parallel analysis of scientific simulation data via Zazen

Authors:
Tiankai Tu;Charles A. Rendleman;Patrick J. Miller;Federico Sacerdoti;Ron O. Dror;David E. Shaw
Affiliations:
D. E. Shaw Research, New York, NY;D. E. Shaw Research, New York, NY;D. E. Shaw Research, New York, NY;D. E. Shaw Research, New York, NY;D. E. Shaw Research, New York, NY;D. E. Shaw Research, New York, NY and Center for Computational Biology and Bioinformatics, Columbia University, New York, NY
Venue:
FAST'10 Proceedings of the 8th USENIX conference on File and storage technologies
Year:
2010

Abstract

As a new generation of parallel supercomputers enables researchers to conduct scientific simulations of unprecedented scale and resolution, terabyte-scale simulation output has become increasingly commonplace. Analysis of such massive data sets is typically I/O-bound: many parallel analysis programs spend most of their execution time reading data from disk rather than performing useful computation. To overcome this I/O bottleneck, we have developed a new data access method. Our main idea is to cache a copy of simulation output files on the local disks of an analysis cluster's compute nodes, and to use a novel task-assignment protocol to co-locate data access with computation. We have implemented our methodology in a parallel disk cache system called Zazen. By avoiding the overhead associated with querying metadata servers and by reading data in parallel from local disks, Zazen is able to deliver a sustained read bandwidth of over 20 gigabytes per second on a commodity Linux cluster with 100 nodes, approaching the optimal aggregate I/O bandwidth attainable on these nodes. Compared with conventional NFS, PVFS2, and Hadoop/HDFS, respectively, Zazen is 75, 18, and 6 times faster for accessing large (1-GB) files, and 25, 13, and 85 times faster for accessing small (2-MB) files. We have deployed Zazen in conjunction with Anton, a special-purpose supercomputer that dramatically accelerates molecular dynamics (MD) simulations, and have been able to accelerate the parallel analysis of terabyte-scale MD trajectories by about an order of magnitude.
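The abstract's central idea, assigning each analysis task to a node that already holds the needed file on its local disk, can be illustrated with a small sketch. This is not Zazen's actual task-assignment protocol, which the abstract does not describe in detail; it is a simple greedy heuristic written for illustration. The function name `assign_tasks`, the `cache_map` structure, and the node and file names are all hypothetical.

```python
from collections import defaultdict


def assign_tasks(cache_map, files):
    """Greedy, locality-aware assignment of files to nodes (illustrative only).

    cache_map: dict mapping node name -> set of file names cached on that
               node's local disk (hypothetical structure).
    files:     iterable of file names that the analysis must read.
    Returns (assignment, uncached): assignment maps node -> list of files it
    should process; uncached lists files that no node holds locally.
    """
    assignment = {node: [] for node in cache_map}
    uncached = []

    # Reverse index: file name -> nodes that hold a local copy.
    holders = defaultdict(list)
    for node in sorted(cache_map):
        for name in cache_map[node]:
            holders[name].append(node)

    # Place the most constrained files (fewest holders) first.
    for name in sorted(files, key=lambda f: (len(holders[f]), f)):
        candidates = holders[name]
        if not candidates:
            uncached.append(name)  # would have to be fetched from elsewhere
            continue
        # Choose the least-loaded holder; break ties by node name so the
        # result is deterministic.
        target = min(candidates, key=lambda n: (len(assignment[n]), n))
        assignment[target].append(name)

    return assignment, uncached


# Hypothetical three-node cluster with partially overlapping caches.
cache_map = {
    "node0": {"f0", "f1", "f2"},
    "node1": {"f1", "f2", "f3"},
    "node2": {"f2"},
}
assignment, uncached = assign_tasks(cache_map, ["f0", "f1", "f2", "f3", "f4"])
print(assignment)
print(uncached)
```

Because the choice depends only on the cache contents and the file list, every node that runs this function on the same inputs arrives at the same assignment. That property lets tasks be placed without consulting a central metadata server, although how Zazen itself achieves this is an assumption not confirmed by the text above.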