This work studies the performance and scalability of "hybrid" parallel programming and execution as applied to raycasting volume rendering, a staple visualization algorithm, on a large multi-core platform. Historically, the Message Passing Interface (MPI) has become the de facto standard for parallel programming and execution on modern parallel systems. As the computing industry trends toward multi-core processors, with four- and six-core chips common today and 128-core chips coming soon, we wish to better understand how algorithmic and parallel programming choices impact performance and scalability on large, distributed-memory multi-core systems. Our findings indicate that the hybrid-parallel implementation, at levels of concurrency ranging from 1,728 to 216,000, performs better, uses a smaller absolute memory footprint, and consumes less communication bandwidth than the traditional MPI-only implementation.
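To make the two ideas in the abstract concrete, the sketch below illustrates (a) raycasting volume rendering with front-to-back "over" compositing and early ray termination, and (b) shared-memory parallelism over image tiles, which is the intra-node half of a hybrid scheme (the inter-node MPI distribution and final compositing are omitted). This is a minimal illustration, not the paper's implementation; the transfer function, tile decomposition, and all names here are hypothetical.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def transfer(s):
    # Hypothetical transfer function: map a scalar field in [0, 1]
    # to RGB emission and per-sample opacity.
    rgb = np.stack([s, 0.5 * s, 1.0 - s], axis=-1)
    alpha = 0.1 * s
    return rgb, alpha

def raycast(volume):
    # Cast one axis-aligned ray per pixel (along axis 0), compositing
    # samples front to back with the "over" operator.
    h, w = volume.shape[1], volume.shape[2]
    color = np.zeros((h, w, 3))
    opacity = np.zeros((h, w))
    for z in range(volume.shape[0]):          # front slice first
        rgb, a = transfer(volume[z])
        weight = ((1.0 - opacity) * a)[..., None]
        color += weight * rgb
        opacity += (1.0 - opacity) * a
        if np.all(opacity > 0.99):            # early ray termination
            break
    return color, opacity

def render_tiles(volume, n_threads=4):
    # Shared-memory parallelism over image tiles: each thread renders a
    # vertical strip of the image; rays never cross tile boundaries here,
    # so the strips can simply be concatenated.
    w = volume.shape[2]
    edges = np.linspace(0, w, n_threads + 1, dtype=int)
    with ThreadPoolExecutor(n_threads) as ex:
        parts = list(ex.map(
            lambda i: raycast(volume[:, :, edges[i]:edges[i + 1]]),
            range(n_threads)))
    colors, opacities = zip(*parts)
    return np.concatenate(colors, axis=1), np.concatenate(opacities, axis=1)
```

Because the tiles are independent, the threaded result is bitwise identical to the serial one; in a full hybrid design, a distributed-memory layer (e.g., MPI) would partition the volume across nodes and composite the per-node images.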