This paper proposes a high-performance digital architecture for computing 2-D convolution that exploits the quadrant symmetry of the kernels. The pixels in the four quadrants of the kernel region around an image pixel are considered simultaneously when computing the partial results of the convolution sum. The architecture performs its computations in the logarithmic domain using novel multiplier-less log2 and inverse-log2 modules. An effective data-handling strategy, developed in conjunction with the logarithmic modules, eliminates the need for multipliers in the architecture. The systolic architecture employs parallel and pipelined processing and produces one output every clock cycle. The new design uses approximately 40% fewer hardware resources than a multiplier-based quadrant symmetric architecture. With a 22×22 kernel on a Xilinx Virtex 2v2000ff896-4 FPGA at a maximum clock frequency of 66.4 MHz, the proposed architecture processes 63.3 frames of 1024×1024 pixels per second, or 66.4 million outputs per second. An error analysis on two image-processing applications, edge detection and noise filtering, shows that the hardware implementation of the proposed design gives accurate results similar to those of the software implementation.
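The quadrant-symmetry idea described above can be illustrated in software. The following is a minimal NumPy sketch, not the paper's hardware design: it assumes an even kernel size (such as 22×22) built by mirroring a single quadrant, sums the four mirrored pixels before applying each coefficient, and uses floating-point `np.log2`/`np.exp2` as stand-ins for the multiplier-less log2 and inverse-log2 modules, whose internal design is not given here. All function names are hypothetical.

```python
import numpy as np

def make_quadrant_symmetric(quadrant):
    """Mirror an (h, h) quadrant into a (2h, 2h) quadrant-symmetric kernel."""
    top = np.hstack([quadrant, quadrant[:, ::-1]])
    return np.vstack([top, top[::-1, :]])

def fold_window(window, h):
    """Add the four pixels that share one coefficient (one per quadrant)."""
    return (window[:h, :h]
            + window[:h, ::-1][:, :h]
            + window[::-1, :][:h, :h]
            + window[::-1, ::-1][:h, :h])

def log_domain_product(a, b):
    """Elementwise a*b computed as 2**(log2|a| + log2|b|).

    Assumption: exact floating-point log2/exp2 replace the paper's
    approximate hardware modules; zeros and signs are handled separately.
    """
    mag_a, mag_b = np.abs(a), np.abs(b)
    nonzero = (mag_a > 0) & (mag_b > 0)
    out = np.zeros(mag_a.shape, dtype=float)
    out[nonzero] = np.exp2(np.log2(mag_a[nonzero]) + np.log2(mag_b[nonzero]))
    negative = (a < 0) ^ (b < 0)  # sign of the product via XOR
    return np.where(negative, -out, out)

def folded_log_window_sum(window, quadrant):
    """One convolution output: fold, then combine in the log domain."""
    folded = fold_window(window, quadrant.shape[0])
    return float(np.sum(log_domain_product(folded, quadrant)))

def frames_per_second(clock_hz, width, height):
    """Frame rate when one output pixel is produced per clock cycle."""
    return clock_hz / (width * height)

# Demo with a 22x22 kernel: 121 coefficient products instead of 484.
rng = np.random.default_rng(0)
quadrant = rng.normal(size=(11, 11))
kernel = make_quadrant_symmetric(quadrant)
window = rng.integers(0, 256, size=(22, 22)).astype(float)
direct = float(np.sum(window * kernel))
folded = folded_log_window_sum(window, quadrant)
```

Because the kernel is symmetric about both axes, the folded sum matches the direct 484-term sum while needing only one coefficient product per quadrant entry. At one output per clock, `frames_per_second(66.4e6, 1024, 1024)` is consistent with the 63.3 frames per second stated in the abstract.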