Analysis of a Parallel Volume Rendering System Based on the Shear-Warp Factorization

  • Authors:
  • Philippe Lacroute

  • Venue:
  • IEEE Transactions on Visualization and Computer Graphics
  • Year:
  • 1996

Visualization

Abstract

This paper presents a parallel volume rendering algorithm that can render a 256 × 256 × 225 voxel medical data set at over 15 Hz and a 512 × 512 × 334 voxel data set at over 7 Hz on a 32-processor Silicon Graphics Challenge. The algorithm achieves these results by minimizing each of the three components of execution time: computation time, synchronization time, and data communication time. Computation time is low because the parallel algorithm is based on the recently-reported shear-warp serial volume rendering algorithm which is over five times faster than previous serial algorithms. The algorithm uses run-length encoding to exploit coherence and an efficient volume traversal to reduce overhead. Synchronization time is minimized by using dynamic load balancing and a task partition that minimizes synchronization events. Data communication costs are low because the algorithm is implemented for shared-memory multiprocessors, a class of machines with hardware support for low-latency fine-grain communication and hardware caching to hide latency.

We draw two conclusions from our implementation. First, we find that on shared-memory architectures data redistribution and communication costs do not dominate rendering time. Second, we find that cache locality requirements impose a limit on parallelism in volume rendering algorithms. Specifically, our results indicate that shared-memory machines with hundreds of processors would be useful only for rendering very large data sets.
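The abstract's remark that run-length encoding exploits coherence can be illustrated with a small sketch. This is not the paper's data structure. It is a simplified one-dimensional example, assuming a single voxel scanline represented as a list of opacities, and the names `rle_encode` and `visited_voxels` are hypothetical. The idea shown is that runs of transparent voxels are stored as a single count, so a traversal can skip them without inspecting each voxel.

```python
from typing import List, Tuple

# A run is (is_transparent, length). Hypothetical layout for illustration only.
Run = Tuple[bool, int]


def rle_encode(opacities: List[float], threshold: float = 0.0) -> List[Run]:
    """Collapse a scanline of opacities into runs of transparent / non-transparent voxels."""
    runs: List[Run] = []
    for alpha in opacities:
        transparent = alpha <= threshold
        if runs and runs[-1][0] == transparent:
            # Same class as the previous voxel: extend the current run.
            runs[-1] = (transparent, runs[-1][1] + 1)
        else:
            # Class changed (or first voxel): start a new run.
            runs.append((transparent, 1))
    return runs


def visited_voxels(runs: List[Run]) -> List[int]:
    """Return the voxel indices a traversal would touch, skipping transparent runs whole."""
    indices: List[int] = []
    pos = 0
    for transparent, length in runs:
        if not transparent:
            indices.extend(range(pos, pos + length))
        # A transparent run costs one step regardless of its length.
        pos += length
    return indices


scanline = [0.0, 0.0, 0.5, 0.7, 0.0, 0.2]
runs = rle_encode(scanline)
touched = visited_voxels(runs)
```

In this sketch the work per scanline scales with the number of runs plus the number of non-transparent voxels, rather than with the full scanline length, which is the sense in which coherent (mostly empty) data sets reduce computation.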