This study compares the speed, area, and latency of shift-and-add arithmetic implemented in fine-grained FPGA resources with the same arithmetic implemented in a proposed coarse-grained embedded block for FPGAs. It first optimizes the mapping of several shift-and-add architectures onto the fine-grained resources of a commercial FPGA to determine which gives the best area, delay, and latency at each word length. It then proposes a new coarse-grained block that supports 16-, 32-, and 64-bit shift-and-add arithmetic, and finally compares the coarse-grained implementations against the best fine-grained ones. Our results show that the coarse-grained implementations are 15 to 47 times smaller and 5 to 18 times faster, depending on the implementation.
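For readers unfamiliar with the term, the following is a minimal software sketch of one well-known shift-and-add algorithm, a rotation-mode CORDIC that approximates sine and cosine. It is an illustration only: the abstract does not say which shift-and-add architectures were evaluated, so the choice of CORDIC, the function name `cordic_sin_cos`, and the parameters `iterations` and `frac_bits` are all assumptions made here, and the code models the arithmetic rather than any FPGA mapping.

```python
import math


def cordic_sin_cos(angle, iterations=16, frac_bits=16):
    """Approximate (sin, cos) of `angle` in radians, for |angle| <= pi/2.

    Hypothetical illustration: names and defaults are not from the study.
    The main loop uses only integer shifts, adds, and subtracts.
    """
    one = 1 << frac_bits  # fixed-point representation of 1.0

    # atan(2^-i) constants in fixed point; in hardware this would be a small ROM.
    atan_table = [round(math.atan(2.0 ** -i) * one) for i in range(iterations)]

    # Each micro-rotation stretches the vector, so start from 1/gain
    # to finish with a unit-length result.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x = round(one / gain)    # fixed-point cosine accumulator
    y = 0                    # fixed-point sine accumulator
    z = round(angle * one)   # residual angle still to be rotated

    for i in range(iterations):
        if z >= 0:
            # Rotate counter-clockwise by atan(2^-i).
            x, y, z = x - (y >> i), y + (x >> i), z - atan_table[i]
        else:
            # Rotate clockwise by atan(2^-i).
            x, y, z = x + (y >> i), y - (x >> i), z + atan_table[i]

    return y / one, x / one


if __name__ == "__main__":
    s, c = cordic_sin_cos(math.pi / 6)
    print(f"sin ~ {s:.4f}, cos ~ {c:.4f}")
```

Each iteration costs two shifts and three additions or subtractions, which is why this class of arithmetic maps onto either lookup-table logic or a dedicated adder-and-shifter block. Raising `frac_bits` and `iterations` together mirrors moving to a wider word length.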