This work studies the performance and scalability of "hybrid" parallel programming and execution as applied to raycasting volume rendering, a staple visualization algorithm, on a large multi-core platform. Historically, the Message Passing Interface (MPI) has become the de facto standard for parallel programming and execution on modern parallel systems. As the computing industry trends toward multi-core processors, with four- and six-core chips common today and 128-core chips coming soon, we wish to better understand how algorithmic and parallel programming choices impact performance and scalability on large, distributed-memory multi-core systems. Our findings indicate that the hybrid-parallel implementation, at levels of concurrency ranging from 1,728 to 216,000, performs better, uses a smaller absolute memory footprint, and consumes less communication bandwidth than the traditional MPI-only implementation.
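To make the two ideas in the abstract concrete, the sketch below illustrates (a) raycasting volume rendering with front-to-back "over" compositing and early ray termination, and (b) shared-memory parallelism over image tiles, which is the intra-node half of a hybrid scheme (the inter-node MPI distribution and final compositing are omitted). This is a minimal illustration, not the paper's implementation; the transfer function, tile decomposition, and all names here are hypothetical.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def transfer(s):
    # Hypothetical transfer function: map a scalar field in [0, 1]
    # to RGB emission and per-sample opacity.
    rgb = np.stack([s, 0.5 * s, 1.0 - s], axis=-1)
    alpha = 0.1 * s
    return rgb, alpha

def raycast(volume):
    # Cast one axis-aligned ray per pixel (along axis 0), compositing
    # samples front to back with the "over" operator.
    h, w = volume.shape[1], volume.shape[2]
    color = np.zeros((h, w, 3))
    opacity = np.zeros((h, w))
    for z in range(volume.shape[0]):          # front slice first
        rgb, a = transfer(volume[z])
        weight = ((1.0 - opacity) * a)[..., None]
        color += weight * rgb
        opacity += (1.0 - opacity) * a
        if np.all(opacity > 0.99):            # early ray termination
            break
    return color, opacity

def render_tiles(volume, n_threads=4):
    # Shared-memory parallelism over image tiles: each thread renders a
    # vertical strip of the image; rays never cross tile boundaries here,
    # so the strips can simply be concatenated.
    w = volume.shape[2]
    edges = np.linspace(0, w, n_threads + 1, dtype=int)
    with ThreadPoolExecutor(n_threads) as ex:
        parts = list(ex.map(
            lambda i: raycast(volume[:, :, edges[i]:edges[i + 1]]),
            range(n_threads)))
    colors, opacities = zip(*parts)
    return np.concatenate(colors, axis=1), np.concatenate(opacities, axis=1)
```

Because the tiles are independent, the threaded result is bitwise identical to the serial one; in a full hybrid design, a distributed-memory layer (e.g., MPI) would partition the volume across nodes and composite the per-node images.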