Journal of Parallel and Distributed Computing - Special issue on dynamic load balancing
Large scale parallel structured AMR calculations using the SAMRAI framework
Proceedings of the 2001 ACM/IEEE conference on Supercomputing
Dynamic Load Balancing for Structured Adaptive Mesh Refinement Applications
ICPP '02 Proceedings of the 2001 International Conference on Parallel Processing
libMesh: a C++ library for parallel adaptive mesh refinement/coarsening simulations
Engineering with Computers
Dendro: parallel algorithms for multigrid and AMR methods on 2:1 balanced octrees
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Radiation modeling using the Uintah heterogeneous CPU/GPU runtime system
Proceedings of the 1st Conference of the Extreme Science and Engineering Discovery Environment: Bridging from the eXtreme to the campus and beyond
Directionally unsplit hydrodynamic schemes with hybrid MPI/OpenMP/GPU parallelization in AMR
International Journal of High Performance Computing Applications
Hi-index | 0.00 |
Adaptive Mesh Refinement is a method which dynamically varies the spatio-temporal resolution of localized mesh regions in numerical simulations, based on the strength of the solution features. In-situ visualization plays an important role for analyzing the time evolving characteristics of the domain structures. Continuous visualization of the output data for various timesteps results in a better study of the underlying domain and the model used for simulating the domain. In this paper, we develop strategies for continuous online visualization of time evolving data for AMR applications executed on GPUs. We reorder the meshes for computations on the GPU based on the users input related to the subdomain that he wants to visualize. This makes the data available for visualization at a faster rate. We then perform asynchronous executions of the visualization steps and fix-up operations on the CPUs while the GPU advances the solution. By performing experiments on Tesla S1070 and Fermi C2070 clusters, we found that our strategies result in 60% improvement in response time and 16% improvement in the rate of visualization of frames over the existing strategy of performing fix-ups and visualization at the end of the timesteps.