Optimizing convolution operators is important because they are used in numerous domains, including electromagnetic computations, image processing, and nanosimulations. In this paper we present our optimizations for 3D convolutions in the BigDFT nanosimulation software. We focus on processors with vector units and on GPU acceleration, and experiment with several architectures. By exploiting the relation between algorithmic specifics and hardware architecture, we obtain performance gains of around 2x on CPU and up to 20x on GPU.