A hardware unit for fast SAH-optimised BVH construction

Authors:
Michael J. Doyle;Colin Fowler;Michael Manzke
Affiliations:
Trinity College Dublin;Trinity College Dublin;Trinity College Dublin
Venue:
ACM Transactions on Graphics (TOG) - SIGGRAPH 2013 Conference Proceedings
Year:
2013

Abstract

Ray-tracing algorithms are known for producing highly realistic images, but at a significant computational cost. For this reason, a large body of research exists on various techniques for accelerating these costly algorithms. One approach to achieving superior performance which has received comparatively little attention is the design of specialised ray-tracing hardware. The research that does exist on this topic has consistently demonstrated that significant performance and efficiency gains can be achieved with dedicated microarchitectures. However, previous work on hardware ray-tracing has focused almost entirely on the traversal and intersection aspects of the pipeline. As a result, the critical aspect of the management and construction of acceleration data-structures remains largely absent from the hardware literature. We propose that a specialised microarchitecture for this purpose could achieve considerable performance and efficiency improvements over programmable platforms. To this end, we have developed the first dedicated microarchitecture for the construction of binned SAH BVHs. Cycle-accurate simulations show that our design achieves significant improvements in raw performance and in the bandwidth required for construction, as well as large efficiency gains in terms of performance per clock and die area compared to manycore implementations. We conclude that such a design would be useful in the context of a heterogeneous graphics processor, and may help future graphics processor designs to reduce predicted technology-imposed utilisation limits.
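
For readers unfamiliar with the term, the following is a minimal software sketch of binned SAH split selection along a single axis. It is a generic illustration of the technique and not the hardware design described in the abstract. The function names (`union`, `area`, `best_binned_split`), the box representation as a pair of 3-tuples, and the default of 16 bins are all assumptions made for this example. The cost used is the relative SAH term (left area times left count plus right area times right count), with the constant traversal cost and parent-area normalisation omitted.

```python
import math

# An axis-aligned box is a pair (lo, hi) of 3-tuples. EMPTY is the
# identity element for union(): it contains nothing.
EMPTY = ((math.inf,) * 3, (-math.inf,) * 3)


def union(a, b):
    """Smallest box enclosing both a and b."""
    return (tuple(map(min, a[0], b[0])), tuple(map(max, a[1], b[1])))


def area(box):
    """Surface area of a box; an empty box has area 0."""
    lo, hi = box
    if lo[0] > hi[0]:
        return 0.0
    dx, dy, dz = (hi[i] - lo[i] for i in range(3))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def best_binned_split(boxes, axis, num_bins=16):
    """Return (cost, bin_index) for the cheapest split on one axis.

    Primitives in bins 0..bin_index go left, the rest go right.
    Returns None when all centroids coincide on this axis.
    """
    centroids = [0.5 * (lo[axis] + hi[axis]) for lo, hi in boxes]
    cmin, cmax = min(centroids), max(centroids)
    if cmax == cmin:
        return None

    # Binning pass: count primitives and grow a bounding box per bin.
    scale = num_bins / (cmax - cmin)
    counts = [0] * num_bins
    bounds = [EMPTY] * num_bins
    for box, c in zip(boxes, centroids):
        b = min(int((c - cmin) * scale), num_bins - 1)
        counts[b] += 1
        bounds[b] = union(bounds[b], box)

    # Right-to-left sweep: area and count of everything from bin i onward.
    right_area = [0.0] * num_bins
    right_count = [0] * num_bins
    acc, n = EMPTY, 0
    for i in range(num_bins - 1, 0, -1):
        acc = union(acc, bounds[i])
        n += counts[i]
        right_area[i] = area(acc)
        right_count[i] = n

    # Left-to-right sweep: evaluate the cost of each candidate plane.
    best = None
    acc, n = EMPTY, 0
    for i in range(num_bins - 1):
        acc = union(acc, bounds[i])
        n += counts[i]
        cost = area(acc) * n + right_area[i + 1] * right_count[i + 1]
        if best is None or cost < best[0]:
            best = (cost, i)
    return best
```

A full builder would typically evaluate all three axes this way, partition the primitives at the cheapest plane, and recurse on each side until a leaf criterion is met. The binning pass touches each primitive once, so the per-node work is linear in the primitive count plus a small term in the number of bins.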