Scheduling of Iterative Algorithms with Matrix Operations for Efficient FPGA Design--Implementation of Finite Interval Constant Modulus Algorithm

Authors:
Přemysl Šůcha;Zdeněk Hanzálek;Antonín Heřmánek;Jan Schier
Affiliations:
Centre for Applied Cybernetics, Department of Control Engineering, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic;Centre for Applied Cybernetics, Department of Control Engineering, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic;Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic, Prague, Czech Republic;Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic, Prague, Czech Republic
Venue:
Journal of VLSI Signal Processing Systems
Year:
2007


Abstract

This paper deals with the optimization of iterative algorithms with matrix operations or nested loops for hardware implementation in Field Programmable Gate Arrays (FPGAs), using Integer Linear Programming (ILP). The method is demonstrated on an implementation of the Finite Interval Constant Modulus Algorithm, an equalization algorithm suitable for modern communication systems (4G and beyond). For the floating-point calculations required by the algorithm, two arithmetic libraries were used in the FPGA implementation: one based on the logarithmic number system, the other using the floating-point number system in the standard IEEE format. Both libraries use pipelined modules. Traditional approaches to the scheduling of nested loops lead to relatively large code, which is unsuitable for FPGA implementation. This paper presents a new high-level synthesis methodology that models both iterative loops and imperfectly nested loops by means of a system of linear inequalities. Moreover, memory access is considered as an additional resource constraint. Since solving ILP-formulated problems is known to be computationally intensive, an important part of the article is devoted to reducing the problem size.
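
To make the phrase "system of linear inequalities" more concrete, the following is a minimal sketch of how an iterative loop can be modelled in cyclic scheduling. It is not the paper's ILP formulation. It assumes the commonly used precedence constraint s_j - s_i >= l_ij - w * h_ij, where s_i is the start time of operation i, l_ij a latency, h_ij an iteration distance, and w the period; none of these symbols appear in the abstract. Resource and memory-access constraints are ignored, and a longest-path relaxation stands in for an ILP solver. The function names and the example latencies are hypothetical.

```python
def earliest_starts(n, edges, w):
    """Earliest start times satisfying s_j - s_i >= l - w*h for every edge.

    edges holds tuples (i, j, l, h). Returns None when the inequalities
    are infeasible for period w (a cycle with positive total weight).
    """
    s = [0] * n
    for _ in range(n):
        changed = False
        for i, j, l, h in edges:
            bound = s[i] + l - w * h
            if bound > s[j]:
                s[j] = bound
                changed = True
        if not changed:
            return s
    return None  # still changing after n passes: positive cycle


def min_period(n, edges, w_max=1000):
    """Smallest integer period w in 1..w_max with a feasible schedule."""
    for w in range(1, w_max + 1):
        s = earliest_starts(n, edges, w)
        if s is not None:
            return w, s
    return None


# Hypothetical recurrence y(k) = a*y(k-1) + x(k):
# operation 0 is a multiplier (assumed latency 2),
# operation 1 is an adder (assumed latency 1).
edges = [
    (0, 1, 2, 0),  # multiply feeds add in the same iteration
    (1, 0, 1, 1),  # add feeds multiply in the next iteration
]
result = min_period(2, edges)
print(result)
```

In this sketch the loop-carried dependence forms a cycle of total latency 3 and iteration distance 1, so no period shorter than 3 satisfies the inequalities. An ILP model along the lines described in the abstract would add constraints on top of these, for example to keep operations from sharing a pipelined unit or a memory port in the same clock cycle.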