We present an API-based compilation strategy for optimizing image applications, written against a high-level image processing library, for three different image processing hardware accelerators. We demonstrate that this strategy pays off in both development cost and overall performance, particularly because it exploits optimization opportunities across library calls that would otherwise be out of reach. The library API provides the semantics of the image computations. The three accelerator targets are quite distinct: the first uses a vector architecture; the second is a SIMD architecture; the third runs on both GPGPUs and multicores through OpenCL. We have adapted standard compilation techniques to perform these compilation and code generation tasks automatically. Our strategy is implemented in PIPS, a source-to-source compiler, which greatly reduces the development cost because standard phases are reused and parameterized. We carried out experiments with applications on hardware functional simulators and GPUs. Our contributions include: (1) a general low-cost compilation strategy for image processing applications, based on the semantics provided by library calls, which improves locality by an order of magnitude; (2) specific heuristics to minimize execution time on the target accelerators; (3) numerous experiments that show the effectiveness of our strategies. We also discuss the conditions required to extend this approach to other application domains.
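To illustrate what an optimization opportunity across library calls might look like, the following minimal sketch contrasts two chained image operators with a single fused pass. It is only an illustration under stated assumptions: the operator names `add_const` and `threshold`, the list-of-rows image representation, and the fused form are all hypothetical and are not taken from the library or from PIPS.

```python
# Hypothetical image operators; names and signatures are illustrative
# assumptions, not the API of the library discussed in the abstract.

def add_const(img, c):
    """Return a new image with c added to every pixel."""
    return [[p + c for p in row] for row in img]


def threshold(img, t):
    """Return a binary image: 1 where the pixel is >= t, else 0."""
    return [[1 if p >= t else 0 for p in row] for row in img]


def unfused(img, c, t):
    # Two separate library calls: the intermediate image is fully
    # materialized before the second call reads it back.
    tmp = add_const(img, c)
    return threshold(tmp, t)


def fused(img, c, t):
    # What a compiler that knows the semantics of both calls could emit:
    # a single traversal with no intermediate image.
    return [[1 if p + c >= t else 0 for p in row] for row in img]


image = [[10, 200, 30], [120, 90, 250]]
assert unfused(image, 20, 128) == fused(image, 20, 128)
```

Each pixel is read once in the fused version and the temporary image disappears, which is one plausible source of the kind of locality gain the abstract mentions. An opaque library call gives a conventional compiler no way to see that the two traversals can be merged.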