We present a method for implementing high-speed finite impulse response (FIR) filters on field-programmable gate arrays (FPGAs). Our algorithm is a multiplierless technique in which fixed-coefficient multipliers are replaced with a series of add and shift operations. The first phase of our algorithm uses registered adders and hardwired shifts. Here, a modified common subexpression elimination (CSE) algorithm reduces the number of adders while maintaining performance. The second phase optimizes routing delay using prelayout wire length estimation techniques to improve the final placed and routed design. The target platforms are Xilinx Virtex FPGA devices, on which we compare our implementation results with those produced by Xilinx Coregen, which is based on distributed arithmetic (DA). We observed up to a 50% reduction in the number of slices and up to a 75% reduction in the number of lookup tables (LUTs) for fully parallel implementations compared to the DA method. We also observed a 50% reduction in the total dynamic power consumption of the filters. Our designs perform up to 27% faster than the multiply-accumulate (MAC) filters implemented by the Xilinx Coregen tool using DSP blocks. For placement, there is a saving of up to 20% in the number of routing channels. This results in lower congestion and up to an 8% reduction in average wirelength.
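To make the multiplierless idea concrete, the following is a minimal Python sketch of replacing a fixed-coefficient multiplication with shifts and adds. It is an illustration only, not the modified CSE algorithm described above: it recodes each integer coefficient into canonical signed digit (CSD) form, a common starting point for shift-and-add filters, and does not share subexpressions between coefficients. The function names `csd_digits`, `shift_add_multiply`, and `fir_shift_add` are hypothetical and chosen for this example.

```python
def csd_digits(c):
    """Return the CSD digits of a non-negative integer c, least significant first.

    Each digit is -1, 0, or +1, and no two adjacent digits are nonzero.
    """
    digits = []
    while c != 0:
        if c & 1:
            # +1 when c % 4 == 1, -1 when c % 4 == 3
            d = 2 - (c & 3)
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits


def shift_add_multiply(x, c):
    """Compute x * c for an integer constant c using only shifts, adds, and subtracts."""
    sign = -1 if c < 0 else 1
    acc = 0
    for shift, d in enumerate(csd_digits(abs(c))):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return sign * acc


def fir_shift_add(samples, coeffs):
    """Direct-form FIR filter y[n] = sum_k coeffs[k] * samples[n - k], multiplier-free."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += shift_add_multiply(samples[n - k], c)
        out.append(acc)
    return out


if __name__ == "__main__":
    print(csd_digits(7))                          # 7 = 8 - 1
    print(fir_shift_add([1, 2, 3, 4], [3, 7, -5]))
```

In this sketch, the number of nonzero CSD digits in a coefficient determines how many adders or subtractors that coefficient needs on its own; a CSE pass such as the one described above would go further by reusing terms that appear in more than one coefficient.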