A single signal processing algorithm can be represented by many mathematically equivalent formulas. When these formulas are implemented in code and run on real machines, however, their runtimes differ widely, and this broad performance range is extremely difficult to model. Moreover, the space of formulas for real signal transforms is far too large to search exhaustively for fast implementations. We cast this search as a control learning problem and present a new method for learning to generate fast formulas, allowing us to search intelligently through only the most promising candidates. Our approach incorporates signal processing knowledge, hardware features, and measured formula performance data to learn to construct fast formulas. After training on performance data for a few formulas of one size, our method can construct formulas with the fastest runtimes possible across many sizes.
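The two claims above — many equivalent formulas, and a search space too large to enumerate — can be made concrete with a small sketch. This is not the paper's system; it is an illustrative example using the Walsh-Hadamard transform, a transform commonly used in this line of work. Each binary "split tree" over n = n1 + n2 yields a mathematically equivalent formula WHT_{2^n} = (WHT_{2^n1} ⊗ I_{2^n2})(I_{2^n1} ⊗ WHT_{2^n2}), and the number of such trees grows as the Catalan numbers. All function names here are my own for illustration.

```python
# Sketch: count the equivalent binary split-tree formulas for WHT_{2^n},
# and apply two different formulas to check they compute the same transform.
from functools import lru_cache


@lru_cache(maxsize=None)
def num_binary_split_trees(n):
    """Count distinct binary split trees (equivalent formulas) for WHT_{2^n}."""
    if n == 1:
        return 1  # leaf: compute WHT_2 directly
    # choose a split n = i + (n - i); subtrees are independent
    return sum(num_binary_split_trees(i) * num_binary_split_trees(n - i)
               for i in range(1, n))


def wht_direct(x):
    """One fixed fast-WHT formula: iterative in-place butterflies."""
    x = list(x)
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x


def tree_size(tree):
    """Vector length 2^n handled by a split tree (an int leaf means WHT_{2^n})."""
    if isinstance(tree, int):
        return 2 ** tree
    left, right = tree
    return tree_size(left) * tree_size(right)


def wht_tree(tree, x):
    """Apply the WHT formula described by `tree` to the vector x."""
    if isinstance(tree, int):
        return wht_direct(x)
    left, right = tree
    n2 = tree_size(right)
    n1 = len(x) // n2
    # (I_{N1} (x) WHT_{N2}): transform contiguous blocks of length N2
    y = []
    for i in range(n1):
        y.extend(wht_tree(right, x[i * n2:(i + 1) * n2]))
    # (WHT_{N1} (x) I_{N2}): transform strided subvectors
    out = y[:]
    for j in range(n2):
        out[j::n2] = wht_tree(left, y[j::n2])
    return out
```

Different trees produce identical outputs but touch memory in very different orders (contiguous blocks vs. long strides), which is exactly why their measured runtimes diverge on real cache hierarchies even though the arithmetic is equivalent. The count grows quickly: already for 2^10 points there are thousands of binary formulas, and allowing wider splits makes the space far larger still.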