We perform a detailed flop and bandwidth analysis of Jos Stam's Stable Fluids algorithm on the CPU, GPU, and Cell. In all three cases, we find that the algorithm is bandwidth bound, with the cores sitting idle up to 96% of the time. Knowing this, we propose two modifications to accelerate the algorithm. The first is a Mehrstellen discretization for the pressure solver, which reduces the solver's running time by a third. The second is a static caching scheme that eliminates roughly 99% of the random lookups in the advection stage; with it, we observe a 2x speedup in that stage. Both modifications apply equally well to all three architectures.
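To make the term "Mehrstellen discretization" concrete, the following is a minimal sketch of the standard two-dimensional compact nine-point form for the Poisson equation, compared against the usual five-point stencil. The abstract does not say which stencil the authors use, and their solver presumably works in three dimensions, so the 2D setting, the test function sin(pi x) sin(pi y), and every function name below are assumptions chosen for illustration only.

```python
import math

def five_point(u, i, j, h):
    """Second-order five-point approximation of the Laplacian at (i, j)."""
    edges = u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1]
    return (edges - 4.0 * u[i][j]) / (h * h)

def mehrstellen_lhs(u, i, j, h):
    """Left side of the compact nine-point (Mehrstellen) scheme in 2D."""
    edges = u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1]
    corners = (u[i - 1][j - 1] + u[i - 1][j + 1]
               + u[i + 1][j - 1] + u[i + 1][j + 1])
    return (4.0 * edges + corners - 20.0 * u[i][j]) / (6.0 * h * h)

def mehrstellen_rhs(f, i, j):
    """Right side: a weighted average of f over the five-point neighborhood."""
    edges = f[i - 1][j] + f[i + 1][j] + f[i][j - 1] + f[i][j + 1]
    return (8.0 * f[i][j] + edges) / 12.0

def max_truncation_error(n, compact):
    """Largest residual when the exact solution is plugged into the scheme.

    Assumed test problem: u = sin(pi x) sin(pi y) on the unit square,
    so that laplacian(u) = f = -2 pi^2 u.
    """
    h = 1.0 / n
    pts = range(n + 1)
    u = [[math.sin(math.pi * i * h) * math.sin(math.pi * j * h) for j in pts]
         for i in pts]
    f = [[-2.0 * math.pi ** 2 * u[i][j] for j in pts] for i in pts]
    worst = 0.0
    for i in range(1, n):
        for j in range(1, n):
            if compact:
                r = mehrstellen_lhs(u, i, j, h) - mehrstellen_rhs(f, i, j)
            else:
                r = five_point(u, i, j, h) - f[i][j]
            worst = max(worst, abs(r))
    return worst

if __name__ == "__main__":
    for n in (8, 16, 32):
        print(n, max_truncation_error(n, False), max_truncation_error(n, True))
```

Each halving of the grid spacing should shrink the five-point residual by roughly a factor of 4 and the compact residual by roughly a factor of 16, reflecting second- versus fourth-order accuracy. The compact stencil reads only the immediate 3x3 neighborhood, so it gains accuracy without widening the memory footprint of each update, which is the property that matters for a bandwidth-bound solver.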