Optimizing Monte Carlo radiosity on graphics hardware
The Journal of Supercomputing
The recent interest in GPGPU (general-purpose computation on graphics processing units) has stimulated improvements in the programmability of the GPU. Although new languages such as OpenCL and CUDA make GPU programming easier, several challenges must still be overcome to improve on the results of a direct implementation. Specifically, a straightforward implementation of the Monte Carlo radiosity algorithm on the GPU does not deliver the expected performance. In this paper we develop three strategies to increase the performance of the implementation: using an additional simplified version of the mesh to reduce the computational requirements, partitioning the scene data to increase data locality, and scheduling threads efficiently to exploit the characteristics of the GPU. Our approach is more flexible than previous solutions, and the results show a significant improvement in execution time.
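For readers unfamiliar with the baseline, the core step of Monte Carlo radiosity can be sketched on the CPU as follows. This is a minimal illustration, not the GPU implementation described in the paper: it estimates the form factor between two directly opposed, parallel unit squares by shooting cosine-distributed rays from one and counting how many hit the other. The function names, the two-square scene, and the parameters (`num_rays`, `distance`, `seed`) are assumptions chosen for the example.

```python
import math
import random


def sample_cosine_direction(rng):
    """Return a cosine-weighted direction on the hemisphere around +z."""
    u1, u2 = rng.random(), rng.random()
    r = math.sqrt(u1)                 # radius on the unit disk
    phi = 2.0 * math.pi * u2          # angle on the unit disk
    # Project the disk sample up onto the hemisphere (Malley's method).
    return (r * math.cos(phi), r * math.sin(phi), math.sqrt(1.0 - u1))


def estimate_form_factor(num_rays, distance=1.0, seed=0):
    """Estimate the form factor from the unit square at z = 0 to the
    parallel unit square at z = distance (hypothetical test scene).

    The estimate is the fraction of cosine-distributed rays, leaving
    uniformly chosen points on the source square, that hit the receiver.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_rays):
        ox, oy = rng.random(), rng.random()      # ray origin on the source
        dx, dy, dz = sample_cosine_direction(rng)
        if dz <= 0.0:
            continue                             # grazing ray never reaches the receiver
        t = distance / dz                        # ray parameter at the receiver plane
        x, y = ox + t * dx, oy + t * dy
        if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
            hits += 1
    return hits / num_rays


if __name__ == "__main__":
    print(estimate_form_factor(200_000))
```

For this configuration the analytic form factor is approximately 0.1998, so the estimate should approach that value as `num_rays` grows. Each ray is independent of the others, which is what makes the method a natural candidate for GPU threads; the irregular memory access caused by rays scattering across the scene is the kind of cost that data partitioning and thread scheduling aim to reduce.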