Design strategies for optimal hybrid final adders in a parallel multiplier
Journal of VLSI Signal Processing Systems - Special issue on VLSI arithmetic and implementations
Optimal Circuits for Parallel Multipliers
IEEE Transactions on Computers
High-performance carry chains for FPGA's
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
A faster distributed arithmetic architecture for FPGAs
FPGA '02 Proceedings of the 2002 ACM/SIGDA tenth international symposium on Field-programmable gate arrays
Layout-aware synthesis of arithmetic circuits
Proceedings of the 39th annual Design Automation Conference
Architecture and CAD for Deep-Submicron FPGAs
Architecture and CAD for Deep-Submicron FPGAs
A 16-Bit by 16-Bit MAC Design Using Fast 5: 3 Compressor Cells
Journal of VLSI Signal Processing Systems
Instruction generation for hybrid reconfigurable systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
An FPGA architecture with enhanced datapath functionality
FPGA '03 Proceedings of the 2003 ACM/SIGDA eleventh international symposium on Field programmable gate arrays
VPR: A new packing, placement and routing tool for FPGA research
FPL '97 Proceedings of the 7th International Workshop on Field-Programmable Logic and Applications
Reconfigurable Multiplier for Virtex FPGA Family
FPL '99 Proceedings of the 9th International Workshop on Field-Programmable Logic and Applications
A hybrid ASIC and FPGA architecture
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
Technology mapping and architecture evalution for k/m-macrocell-based FPGAs
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Improved use of the carry-save representation for the synthesis of complex arithmetic circuits
Proceedings of the 2004 IEEE/ACM International conference on Computer-aided design
Partial Product Reduction Based on Look-Up Tables
VLSID '06 Proceedings of the 19th International Conference on VLSI Design held jointly with 5th International Conference on Embedded Systems Design
Automatic synthesis of compressor trees: reevaluating large counters
Proceedings of the conference on Design, automation and test in Europe
Enhancing FPGA performance for arithmetic circuits
Proceedings of the 44th annual Design Automation Conference
A Compact High-Speed Parallel Multiplication Scheme
IEEE Transactions on Computers
IEEE Transactions on Computers
Improving XOR-Dominated Circuits by Exploiting Dependencies between Operands
ASP-DAC '07 Proceedings of the 2007 Asia and South Pacific Design Automation Conference
Design, synthesis and evaluation of heterogeneous FPGA with mixed LUTs and macro-gates
Proceedings of the 2007 IEEE/ACM international conference on Computer-aided design
Efficient synthesis of compressor trees on FPGAs
Proceedings of the 2008 Asia and South Pacific Design Automation Conference
Improving synthesis of compressor trees on FPGAs via integer linear programming
Proceedings of the conference on Design, automation and test in Europe
Measuring the Gap Between FPGAs and ASICs
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Field Programmable Compressor Trees: Acceleration of Multi-Input Addition on FPGAs
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
An FPGA Logic Cell and Carry Chain Configurable as a 6:2 or 7:2 Compressor
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Improving FPGA performance for carry-save arithmetic
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Reducing the pressure on routing resources of FPGAs with generic logic chains
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays
Hi-index | 0.00 |
To improve FPGA performance for arithmetic circuits, this paper proposes a new architecture for FPGA logic cells that includes a 6:2 compressor. The new cell features additional fast carry-chains that concatenate adjacent compressors and can be routed locally without the global routing network. Unlike previous carry-chains for binary and ternary addition, the carry chain used by the new cell only spans 2 logic blocks, which significantly improves the delay of multi-input addition operations mapped onto the FPGA. The delay and area overhead that arises from augmenting a traditional FPGA logic cell with the new compressor structure is minimal. Using this new cell, we observed an average speedup in combinational delay of 1.41x compared to adder trees synthesized using ternary adders