A novel dependence graph representation, the multiple-order dependence graph, is proposed for multimedia signal processing algorithms formulated as nested loops. It concisely represents an entire family of dependence graphs. This representation facilitates the development of innovative implementation approaches for nested-loop formulated multimedia algorithms such as motion estimation, matrix-matrix product, 2D linear transform, and others. In particular, an algebraic linear mapping (assignment and scheduling) methodology can be applied to implement such algorithms on an array of simple processing elements. The feasibility of this new approach is demonstrated on three major target architectures: an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), and a programmable clustered VLIW processor.
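To make the assignment and scheduling step concrete, the following is a minimal sketch of a classical linear space-time mapping applied to the matrix-matrix product. It is not the multiple-order dependence graph method itself: it maps one ordinary 3D dependence graph. The dependence vectors, the schedule vector `s`, the projection direction `d`, the assignment matrix `P`, and all function names are illustrative assumptions rather than values taken from the paper.

```python
from itertools import product

# Assumed dependence vectors for a localized C = A * B over indices (i, j, k):
# B operands travel along i, A operands along j, partial sums of C along k.
DEPENDENCES = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]


def dot(u, v):
    """Inner product of two equal-length integer vectors."""
    return sum(a * b for a, b in zip(u, v))


def is_valid_schedule(s, d, dependences):
    """Check the usual linear-schedule conditions.

    Causality: s . e > 0 for every dependence vector e.
    No conflict along the projection: s . d != 0, so nodes that are
    projected onto the same processing element get distinct time steps.
    """
    return all(dot(s, e) > 0 for e in dependences) and dot(s, d) != 0


def map_graph(n, s, P):
    """Map each node of the n x n x n index space to (PE coordinates, time)."""
    mapping = {}
    for node in product(range(n), repeat=3):
        pe = tuple(dot(row, node) for row in P)  # assignment: P * node
        time = dot(s, node)                      # scheduling: s . node
        mapping[node] = (pe, time)
    return mapping


def has_conflict(mapping):
    """True if two different nodes share the same PE at the same time step."""
    slots = list(mapping.values())
    return len(set(slots)) != len(slots)


# Hypothetical design choice: project along k onto a 2D array of PEs.
s = (1, 1, 1)               # schedule vector
d = (0, 0, 1)               # projection direction
P = [(1, 0, 0), (0, 1, 0)]  # assignment matrix, chosen so that P * d = 0

mapping = map_graph(3, s, P)
pes = {pe for pe, _ in mapping.values()}
times = {t for _, t in mapping.values()}
```

For a 3 x 3 x 3 index space this choice uses a 3 x 3 array of processing elements, with node (i, j, k) executed on element (i, j) at time i + j + k. Other choices of `s` and `d` trade the number of processing elements against total latency, which is the design space the linear mapping methodology explores.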