Circuits, Systems, and Signal Processing
Computational frameworks for the fast Fourier transform
Computational frameworks for the fast Fourier transform
A SIMD Vectorizing Compiler for Digital Signal Processing Algorithms
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Short Vector Code Generation for the Discrete Fourier Transform
IPDPS '03 Proceedings of the 17th International Symposium on Parallel and Distributed Processing
FFT program generation for shared memory: SMP and multicore
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Automatic performance optimization of the discrete fourier transform on distributed memory computers
ISPA'06 Proceedings of the 4th international conference on Parallel and Distributed Processing and Applications
FFT program generation for shared memory: SMP and multicore
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
System Demonstration of Spiral: Generator for High-Performance Linear Transform Libraries
AMAST 2008 Proceedings of the 12th international conference on Algebraic Methodology and Software Technology
Computer generation of fast fourier transforms for the cell broadband engine
Proceedings of the 23rd international conference on Supercomputing
Computer Generation of General Size Linear Transform Libraries
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
Operator Language: A Program Generation Framework for Fast Kernels
DSL '09 Proceedings of the IFIP TC 2 Working Conference on Domain-Specific Languages
Automatic synthesis of high performance mathematical programs
Proceedings of the 2009 international symposium on Symbolic and algebraic computation
Generating SIMD vectorized permutations
CC'08/ETAPS'08 Proceedings of the Joint European Conferences on Theory and Practice of Software 17th international conference on Compiler construction
Proceedings of the 4th International Workshop on Parallel and Symbolic Computation
Spiral-generated modular FFT algorithms
Proceedings of the 4th International Workshop on Parallel and Symbolic Computation
Automatic SIMD vectorization of fast fourier transforms for the larrabee and AVX instruction sets
Proceedings of the international conference on Supercomputing
Computer Generation of Hardware for Linear Digital Signal Processing Transforms
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Automatic performance optimization of the discrete fourier transform on distributed memory computers
ISPA'06 Proceedings of the 4th international conference on Parallel and Distributed Processing and Applications
Computer generation of efficient software viterbi decoders
HiPEAC'10 Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers
Spiral in scala: towards the systematic construction of generators for performance libraries
Proceedings of the 12th international conference on Generative programming: concepts & experiences
Hi-index | 0.00 |
We present a rewriting system that automatically vectorizes signal transform algorithms at a high level of abstraction. The input to the system is a transform algorithm given as a formula in the well-known Kronecker product formalism. The output is a "vectorized" formula, which means it consists exclusively of constructs that can be directly mapped into short vector code. This approach obviates compiler vectorization, which is known to be limited in this domain. We included the formula vectorization into the Spiral program generator for signal transforms, which enables us to generate vectorized code and further optimize for the memory hierarchy through search over alternative algorithms. Benchmarks for the discrete Fourier transform (DFT) show that our generated floating-point code is competitive with and that our fixed-point code clearly outperforms the best available libraries.