Astrophysical simulations of protoplanetary disks and gas giant planet formation are performed with a variety of numerical methods, and some of the codes in use today have produced scientifically significant results for years or even decades. Each must evolve millions of resolution elements over millions of time steps, capture and store output data, and analyze that data rapidly and efficiently. Doing so effectively requires a parallel code that scales to tens or hundreds of processors, along with an efficient workflow for transporting, analyzing, and interpreting the output data. Because such simulations usually run on moderate to large parallel systems, the compute system is generally located at a remote institution. Analysis of results, however, is typically performed interactively, and since most supercomputing centers do not offer dedicated interactive nodes, simulation output must be transferred to local resources. Even where interactive resources are available, typical network latencies make X-forwarded displays nearly unusable. Because the data sets can be quite large and traditional transfer mechanisms such as scp and sftp offer relatively low throughput, this transfer of data sets becomes a bottleneck in the research workflow. In this article we measure the scalability of the Computational HYdrodynamics with MultiplE Radiation Algorithms (CHYMERA) code on the SGI Altix architecture and find that it scales well up to 64 threads for moderate and large problem sizes. We also present a novel approach that enables rapid transfer and analysis of simulation data via the Data Capacitor (DC) and Lustre WAN (Wide Area Network) [17]. Using a WAN file system to tie batch-operated compute resources to interactive analysis and visualization resources is of general interest and can be applied broadly.
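Scaling claims like "scales well up to 64 threads" are conventionally quantified as speedup (serial time over parallel time) and parallel efficiency (speedup over thread count). A minimal sketch of that bookkeeping follows; the thread counts and wall-clock timings below are purely illustrative and are not the paper's measurements.

```python
def speedup_and_efficiency(times):
    """Compute strong-scaling metrics from measured run times.

    times: dict mapping thread count -> wall-clock seconds.
    Returns a dict mapping thread count -> (speedup, efficiency),
    both relative to the smallest thread count measured.
    """
    n_ref = min(times)            # reference run (ideally 1 thread)
    t_ref = times[n_ref]
    result = {}
    for n, t in sorted(times.items()):
        speedup = t_ref / t       # how much faster than the reference
        efficiency = speedup * n_ref / n  # fraction of ideal linear scaling
        result[n] = (speedup, efficiency)
    return result

# Illustrative (made-up) timings for a fixed problem size:
metrics = speedup_and_efficiency({1: 6400.0, 16: 440.0, 64: 125.0})
for n, (s, e) in metrics.items():
    print(f"{n:3d} threads: speedup {s:6.2f}, efficiency {e:5.2f}")
```

An efficiency that stays near 1.0 out to 64 threads would correspond to the "scales well" result reported for moderate and large problem sizes; a sharp drop-off marks where communication or serial overheads begin to dominate.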