As parallel algorithms and architectures drive the longest molecular dynamics (MD) simulations towards the millisecond scale, traditional sequential post-simulation data analysis methods are becoming increasingly untenable. Inspired by the programming interface of Google's MapReduce, we have built a new parallel analysis framework called HiMach, which allows users to write trajectory analysis programs sequentially and executes those programs in parallel automatically. We introduce (1) a new MD trajectory data analysis model that is amenable to parallel processing, (2) a new interface for defining trajectories to be analyzed, (3) a novel method to make use of an existing sequential analysis tool called VMD, and (4) an extension to the original MapReduce model to support multiple rounds of analysis. Performance evaluations on up to 512 cores demonstrate the efficiency and scalability of the HiMach framework on a Linux cluster.
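To make the MapReduce-style analysis model concrete, the following is a minimal Python sketch of the general idea: a per-frame function written sequentially, a cross-frame reduction keyed by analysis name, and a second pass over the first round's output. It is not HiMach's actual interface. The names `TRAJECTORY`, `map_frame`, `reduce_series`, `run_round`, and `summarize` are hypothetical, the trajectory is toy data, and the driver runs sequentially where a real framework would distribute the per-frame work across cores.

```python
from collections import defaultdict
from math import dist

# Hypothetical toy trajectory: each frame is a list of (x, y, z) atom positions.
TRAJECTORY = [
    [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
    [(0.0, 0.0, 0.0), (6.0, 8.0, 0.0)],
    [(1.0, 1.0, 1.0), (1.0, 1.0, 1.0)],
]


def map_frame(frame_index, frame):
    """Per-frame analysis: emit (key, value) pairs for a single frame."""
    # Distance between atom 0 and atom 1, tagged with the frame index
    # so that the reducer can restore time order.
    yield ("dist_0_1", (frame_index, dist(frame[0], frame[1])))


def reduce_series(key, values):
    """Cross-frame analysis: order the values by frame index."""
    return [value for _, value in sorted(values)]


def run_round(frames, mapper, reducer):
    """Sequential stand-in for one map/group/reduce round.

    Each frame is processed independently of the others, which is what
    makes the map phase a candidate for parallel execution.
    """
    groups = defaultdict(list)
    for index, frame in enumerate(frames):
        for key, value in mapper(index, frame):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def summarize(series_by_key):
    """Second pass over the first round's output: mean of each series."""
    return {key: sum(vals) / len(vals) for key, vals in series_by_key.items()}


series = run_round(TRAJECTORY, map_frame, reduce_series)
means = summarize(series)
```

Here `series` holds one time-ordered distance series per key and `means` holds a single summary value per series. The point of the design is that the analysis author writes only `map_frame` and `reduce_series` as ordinary sequential functions, while the driver decides how the frames are partitioned and executed.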