Using existing programming tools, writing high-performance image processing code requires sacrificing readability, portability, and modularity. We argue that this is a consequence of conflating the computations that define the algorithm with decisions about storage and the order of computation. We refer to these latter two concerns as the schedule, which includes choices of tiling, fusion, recomputation vs. storage, vectorization, and parallelism. We propose a representation for feed-forward imaging pipelines that separates the algorithm from its schedule, enabling high performance without sacrificing code clarity. This decoupling simplifies the algorithm specification: images and intermediate buffers become functions over an infinite integer domain, with no explicit storage or boundary conditions. Imaging pipelines are compositions of functions. Programmers separately specify scheduling strategies for the various functions composing the algorithm, which allows them to efficiently explore different optimizations without changing the algorithmic code. We demonstrate the power of this representation by expressing a range of recent image processing applications in an embedded domain-specific language called Halide, and compiling them for ARM, x86, and GPUs. Our compiler targets SIMD units, multiple cores, and complex memory hierarchies. We demonstrate that it can handle algorithms such as a camera raw pipeline, the bilateral grid, fast local Laplacian filtering, and image segmentation. The algorithms expressed in our language are both shorter and faster than state-of-the-art implementations.
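The separation of algorithm from schedule can be illustrated with a small sketch. The following is plain Python, not Halide code, and every name in it (`make_input`, `blur_x`, `blur_y`, `materialize`, `realize`, `counted`) is hypothetical and chosen only for illustration. It assumes a two-stage 3x3 box blur as the algorithm, models each stage as a function over integer coordinates, and compares two schedules for the intermediate stage: recomputing it at every use, or computing it once and storing it. The clamped input is also an assumption, used here so the input function is defined at every integer coordinate.

```python
# Algorithm: each stage is a pure function over integer coordinates (x, y).
def make_input(width, height):
    def inp(x, y):
        # Assumed boundary policy: clamp, so inp is defined for all integers.
        cx = min(max(x, 0), width - 1)
        cy = min(max(y, 0), height - 1)
        return cx + 10 * cy  # synthetic pixel values
    return inp

def blur_x(f):
    return lambda x, y: (f(x - 1, y) + f(x, y) + f(x + 1, y)) / 3

def blur_y(f):
    return lambda x, y: (f(x, y - 1) + f(x, y) + f(x, y + 1)) / 3

# Schedule helpers: they change where and when values are computed,
# never what values are computed.
def materialize(f, x0, x1, y0, y1):
    """Evaluate f once over [x0, x1) x [y0, y1) and serve reads from storage."""
    table = {(x, y): f(x, y) for y in range(y0, y1) for x in range(x0, x1)}
    return lambda x, y: table[(x, y)]

def realize(f, width, height):
    """Evaluate the final stage over the output region."""
    return [[f(x, y) for x in range(width)] for y in range(height)]

def counted(f, counter):
    """Wrap f so that each evaluation increments counter[0]."""
    def g(x, y):
        counter[0] += 1
        return f(x, y)
    return g

W, H = 8, 6
inp = make_input(W, H)

# Schedule A: recompute blur_x at every use (no intermediate storage).
calls_inline = [0]
bx_inline = counted(blur_x(inp), calls_inline)
out_inline = realize(blur_y(bx_inline), W, H)

# Schedule B: compute blur_x once over the rows blur_y will read, then store.
calls_stored = [0]
bx_stored = materialize(counted(blur_x(inp), calls_stored), 0, W, -1, H + 1)
out_stored = realize(blur_y(bx_stored), W, H)

print(out_inline == out_stored)          # same algorithm, same result
print(calls_inline[0], calls_stored[0])  # evaluations of the blur_x stage
```

In this sketch the two schedules produce identical output, while the inline schedule evaluates the intermediate stage three times per output pixel and the stored schedule evaluates it once per stored pixel at the cost of a buffer. This is the recomputation vs. storage trade-off the abstract names; tiling, vectorization, and parallelism are not modeled here.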