Complex shaders must be partitioned into multiple passes to execute on GPUs with limited hardware resources. Automatic partitioning gives rise to an NP-hard scheduling problem that can be solved by any number of established techniques. One such technique, dynamic programming (DP), is commonly used for instruction scheduling and register allocation in the code generation phase of compilers. Since automatic partitioning occurs during shader compilation, it is natural to ask whether DP is useful for shader partitioning as well as for code generation. This paper demonstrates that these problems are Markovian and can be solved by DP techniques. It presents a DP algorithm for shader partitioning that can be adapted for use with any GPU architecture. Unlike solutions produced by other techniques, DP solutions are globally optimal. Experimental results on a set of test cases, using a commercial prerelease compiler for a popular high-level shading language, showed that a DP algorithm had an average runtime cost of O(n^1.14966), which is less than O(n log n) over the region of interest in n. This demonstrates that efficient and optimal automatic shader partitioning can be an emergent byproduct of a DP-based code generator for a very high-performance GPU.
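To make the idea of DP-based partitioning concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a heavily simplified model: the shader is a fixed linear sequence of operations (real shaders are DAGs), each operation consumes some amount of a single per-pass resource, and cutting the sequence at a boundary incurs a cost for saving live values between passes. The names `partition_passes`, `op_cost`, `cut_cost`, `capacity`, and `pass_overhead` are all hypothetical. Under these assumptions the best partition of a prefix depends only on where that prefix ends, which is the kind of Markovian structure that lets DP find a globally optimal answer.

```python
from typing import List, Tuple


def partition_passes(
    op_cost: List[int],
    cut_cost: List[int],
    capacity: int,
    pass_overhead: int = 1,
) -> Tuple[int, List[Tuple[int, int]]]:
    """Split operations 0..n-1 into contiguous passes at minimum total cost.

    Hypothetical model (an assumption, not taken from the paper):
      op_cost[i]    resource units that operation i uses within its pass
      cut_cost[i]   cost of saving live values when a pass starts at i (i > 0)
      capacity      resource units available to a single pass
      pass_overhead fixed cost charged for every pass
    Returns (total_cost, [(start, end), ...]) with half-open index ranges.
    """
    n = len(op_cost)
    inf = float("inf")
    best = [inf] * (n + 1)   # best[j]: minimum cost to execute ops 0..j-1
    prev = [-1] * (n + 1)    # prev[j]: start index of the last pass ending at j
    best[0] = 0

    for j in range(1, n + 1):
        used = 0
        # Try every feasible last pass [i, j), growing it backwards from j.
        for i in range(j - 1, -1, -1):
            used += op_cost[i]
            if used > capacity:
                break  # extending the pass further only uses more resources
            boundary = cut_cost[i] if i > 0 else 0
            candidate = best[i] + pass_overhead + boundary
            if candidate < best[j]:
                best[j] = candidate
                prev[j] = i

    if best[n] == inf:
        raise ValueError("an operation exceeds the capacity of a single pass")

    # Walk the back-pointers to recover the chosen passes.
    passes = []
    j = n
    while j > 0:
        passes.append((prev[j], j))
        j = prev[j]
    passes.reverse()
    return best[n], passes


if __name__ == "__main__":
    # Six operations of cost 2, at most 6 units per pass. Cutting before
    # operations 2 and 4 is cheap; every other cut is expensive.
    cost, passes = partition_passes([2] * 6, [0, 5, 1, 5, 1, 5], capacity=6)
    print(cost, passes)
```

In the example under `__main__`, a greedy strategy that fills each pass to capacity would cut once, before operation 3, at a total cost of 7. The DP instead chooses three smaller passes that cut at the two cheap boundaries, for a total cost of 5. The gap between a locally reasonable choice and the global optimum is the property the abstract attributes to DP solutions.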