Efficient hardware implementation of Ray Tracing based on an embedded software for intersection computation

  • Authors:
  • Alexandre S. Nery; Nadia Nedjah; Felipe M. G. França

  • Affiliations:
  • LAM - Computer Architecture and Microelectronics Laboratory Systems Engineering and Computer Science Program, COPPE Universidade Federal do Rio de Janeiro, Brazil;Department of Electronics Engineering and Telecommunications, Faculty of Engineering Universidade do Estado do Rio de Janeiro, Brazil;LAM - Computer Architecture and Microelectronics Laboratory Systems Engineering and Computer Science Program, COPPE Universidade Federal do Rio de Janeiro, Brazil

  • Venue:
  • Journal of Systems Architecture: the EUROMICRO Journal
  • Year:
  • 2013

Abstract

Parallel implementations of Ray Tracing have enabled real-time performance, as the algorithm is embarrassingly parallel. However, to achieve both interactivity and real-time performance, the algorithm should run at a high frame rate, i.e. at least 60 frames per second. A custom parallel design in hardware is therefore likely to achieve high rendering performance. In this paper, we improve the GridRT architecture presented in previous work. GridRT supports the main desirable features of Ray Tracing, such as shadows and reflection effects, at a low area cost and with promising rendering performance. In this work, an application-specific instruction has been added, and the underlying computation has been embedded into the processor's microprogram, to perform the ray-triangle intersection computations. These computations are pipelined whenever possible, yielding a considerable reduction in cycles per intersection test. The presented architecture is based on the uniform grid acceleration structure. It allows for massive twofold parallelism: parallel ray-triangle intersection tests as well as parallel processing of many rays. A hardware implementation of the improved architecture is presented, together with the corresponding performance results and resource requirements. The rendering time is reduced by 80% using a grid configuration of eight processing elements, and the time of each intersection computation is reduced by 50% with respect to the original GridRT implementation.
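
For readers unfamiliar with the operation that the application-specific instruction accelerates, the following is a minimal software sketch of a ray-triangle intersection test. It uses the well-known Möller-Trumbore formulation, which is an assumption made for illustration: the abstract does not state which intersection algorithm GridRT implements in its microprogram. The names `ray_triangle_intersect` and `closest_hit` are hypothetical and do not come from the paper.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def ray_triangle_intersect(
    origin: Vec3, direction: Vec3, tri: Triangle, eps: float = 1e-9
) -> Optional[float]:
    """Return the ray parameter t of the hit, or None if the ray misses.

    Moller-Trumbore test (assumed here, not confirmed by the paper).
    """
    v0, v1, v2 = tri
    edge1 = sub(v1, v0)
    edge2 = sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if abs(det) < eps:  # ray is parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, pvec) * inv_det  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, qvec) * inv_det
    return t if t > eps else None  # ignore hits behind the ray origin


def closest_hit(
    origin: Vec3, direction: Vec3, triangles: Sequence[Triangle]
) -> Optional[Tuple[int, float]]:
    """Test one ray against every triangle and keep the nearest hit.

    The per-triangle tests are independent of one another, which is
    the kind of work a parallel design can distribute.
    """
    best: Optional[Tuple[int, float]] = None
    for index, tri in enumerate(triangles):
        t = ray_triangle_intersect(origin, direction, tri)
        if t is not None and (best is None or t < best[1]):
            best = (index, t)
    return best
```

In this sketch, each call to `ray_triangle_intersect` depends only on one ray and one triangle, so the tests inside `closest_hit` could run concurrently, and separate rays could be processed concurrently as well. This mirrors, in software terms only, the twofold parallelism the abstract describes.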