Addressing the petascale data challenge using in-situ analytics
Proceedings of the 2nd international workshop on Petascale data analytics: challenges and opportunities
Combining in-situ and in-transit processing to enable extreme-scale scientific analysis
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Emerging scientific application workflows are composed of heterogeneous, coupled component applications that simulate different aspects of the physical phenomena being modeled and that interact and exchange significant volumes of data at runtime. Given the widening performance gap between on-chip data sharing and off-chip data transfers in current multicore-based systems, moving large volumes of data over the communication network fabric can significantly degrade performance. Minimizing the volume of inter-application data exchanged across compute nodes over the network is therefore critical to overall application performance and system efficiency. In this paper, we investigate the in-situ execution of the coupled components of a scientific application workflow so as to maximize on-chip exchange of data. Specifically, we present a distributed data sharing and task execution framework that (1) employs data-centric task placement to map computations from the coupled applications onto processor cores so that a large portion of the data exchanges can be performed through intra-node shared memory, and (2) provides a shared-space programming abstraction that supplements existing parallel programming models (e.g., message passing) with specialized one-sided asynchronous data access operators and can be used to express coordination and data exchanges between the coupled components. We also present the implementation of the framework and its experimental evaluation on the Jaguar Cray XT5 at Oak Ridge National Laboratory.
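To make the idea of data-centric task placement concrete, the following is a minimal sketch and not the framework's actual algorithm, which the abstract does not specify. It assumes a simple greedy heuristic: producer tasks are already pinned to nodes, and each consumer task is placed on the node with a free core that holds the largest share of its input. All names (`place_consumers`, `intra_node_fraction`, `producer_node`, `demand`, `free_cores`) are hypothetical and chosen for illustration only.

```python
def place_consumers(producer_node, demand, free_cores):
    """Greedy data-centric placement (illustrative assumption, not the paper's method).

    producer_node: dict mapping producer task -> node id where its data resides
    demand:        dict mapping consumer task -> {producer task: bytes needed}
    free_cores:    dict mapping node id -> number of cores free for consumers
    Returns a dict mapping consumer task -> node id.
    """
    free = dict(free_cores)
    placement = {}
    # Place the consumers with the largest total input first.
    for consumer in sorted(demand, key=lambda c: -sum(demand[c].values())):
        # Bytes of this consumer's input that reside on each node.
        local_bytes = {}
        for producer, nbytes in demand[consumer].items():
            node = producer_node[producer]
            local_bytes[node] = local_bytes.get(node, 0) + nbytes
        candidates = sorted(n for n in free if free[n] > 0)
        if not candidates:
            raise ValueError("not enough free cores for all consumers")
        # Choose the node holding the most input; ties go to the lowest node id.
        best = max(candidates, key=lambda n: local_bytes.get(n, 0))
        free[best] -= 1
        placement[consumer] = best
    return placement


def intra_node_fraction(producer_node, demand, placement):
    """Fraction of exchanged bytes that stay within a node under a placement."""
    total = 0
    local = 0
    for consumer, needs in demand.items():
        for producer, nbytes in needs.items():
            total += nbytes
            if producer_node[producer] == placement[consumer]:
                local += nbytes
    return local / total if total else 1.0
```

In this sketch, `intra_node_fraction` is the quantity the placement tries to raise: the share of coupled-data bytes that can be served from intra-node shared memory rather than crossing the network. A production scheduler would also need to account for load balance, memory limits, and data decomposition, which this heuristic ignores.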