Many data-intensive scientific analysis techniques require global domain traversal, which has long been a bottleneck for efficient parallelization on distributed-memory architectures. Inspired by MapReduce and other simplified parallel programming approaches, we have designed DStep, a flexible system that greatly simplifies the efficient parallelization of domain traversal techniques at scale. To deliver both simplicity for users and scalability on HPC platforms, we introduce a novel two-tiered communication architecture for managing and exploiting asynchronous communication loads. We also integrate our design with advanced parallel I/O techniques that operate directly on native simulation output. We demonstrate DStep by performing teleconnection analysis across ensemble runs of terascale atmospheric CO2 and climate data, and we show scalability results on up to 65,536 IBM BlueGene/P cores.
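To make the notion of domain traversal concrete, the following is a minimal single-process sketch of the general pattern: the domain is split into blocks, each block owns a work queue, and a particle that leaves its block is handed off to the block that now owns it. This is not DStep's API. The function `trace`, its parameters, and the 1D constant-velocity flow are all assumptions chosen for illustration. In a distributed setting each hand-off would become a message between processes, which is the communication load the abstract refers to.

```python
from collections import deque


def trace(seeds, num_blocks=4, block_width=10.0, velocity=3.0, max_steps=20):
    """Advect 1D particles to the right across a block-decomposed domain.

    Hypothetical illustration only. Seeds are assumed to lie in
    [0, num_blocks * block_width). Returns the final particle positions
    and the number of block-to-block hand-offs that occurred.
    """
    domain_end = num_blocks * block_width
    queues = [deque() for _ in range(num_blocks)]  # one work queue per block

    def owner(x):
        # Index of the block that contains position x.
        return min(int(x // block_width), num_blocks - 1)

    for x in seeds:
        queues[owner(x)].append((x, 0))

    finished = []
    handoffs = 0
    while any(queues):  # an empty deque is falsy
        for block_id, queue in enumerate(queues):
            while queue:
                x, steps = queue.popleft()
                # Advance the particle for as long as it stays in this block.
                while steps < max_steps and x < domain_end and owner(x) == block_id:
                    x += velocity
                    steps += 1
                if steps >= max_steps or x >= domain_end:
                    finished.append(x)  # step budget spent or left the domain
                else:
                    # The particle crossed into another block: hand it off.
                    queues[owner(x)].append((x, steps))
                    handoffs += 1
    return finished, handoffs


if __name__ == "__main__":
    positions, handoffs = trace([0.0, 5.0, 12.0])
    print(positions, handoffs)
```

The number of hand-offs depends on the data (seed positions and flow), not on the code, which is why traversal workloads of this kind are hard to balance and schedule ahead of time.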