International Journal of Reconfigurable Computing - Special issue on Selected Papers from the 2011 International Conference on Reconfigurable Computing and FPGAs (ReConFig 2011)
We discuss hardware/software co-processing on a hybrid processor for k-means clustering, a compute- and data-intensive multispectral imaging algorithm. The experiments are performed on two models of the Altera Excalibur board: the first uses the soft IP core 32-bit NIOS 1.1 RISC processor, and the second uses the hard IP core ARM processor. We compare the performance of the sequential k-means algorithm with three different accelerated versions, and we consider the granularity and synchronization issues that arise when mapping an algorithm to a hybrid processor. Our results show that migrating computation to combined hardware/software on the Excalibur ARM achieves a speedup of 11.8X over a software-only implementation on a Gigahertz Pentium III. Speedup on the Excalibur NIOS is limited by the communication cost of transferring data from external memory through the processor to the customized circuits. This limitation is overcome on the Excalibur ARM, where dual-port memories, accessible to both the processor and the configurable logic, have the biggest performance impact of all the techniques studied.
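For readers unfamiliar with the baseline, the sequential k-means algorithm mentioned above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names (`assign`, `update`, `kmeans`), the squared Euclidean distance, and the stopping rule are all assumptions, and the abstract does not say which steps were moved into configurable logic. The per-pixel distance loop in `assign` is simply the most compute-heavy part of this sketch, which makes it a plausible candidate for acceleration.

```python
def assign(pixels, centers):
    """Return, for each pixel, the index of its nearest center.

    Assumption: squared Euclidean distance over the spectral channels.
    """
    labels = []
    for p in pixels:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
        labels.append(dists.index(min(dists)))
    return labels


def update(pixels, labels, centers):
    """Recompute each center as the mean of the pixels assigned to it."""
    k = len(centers)
    dims = len(centers[0])
    sums = [[0.0] * dims for _ in range(k)]
    counts = [0] * k
    for p, j in zip(pixels, labels):
        counts[j] += 1
        for d in range(dims):
            sums[j][d] += p[d]
    # An empty cluster keeps its previous center (an assumption of this sketch).
    return [
        [s / counts[j] for s in sums[j]] if counts[j] else list(centers[j])
        for j in range(k)
    ]


def kmeans(pixels, centers, max_iters=100):
    """Alternate assignment and update until the labels stop changing."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iters):
        new_labels = assign(pixels, centers)
        if new_labels == labels:
            break
        labels = new_labels
        centers = update(pixels, labels, centers)
    return labels, centers


# Hypothetical two-channel pixels and initial centers, for illustration only.
pixels = [(0, 0), (0, 2), (10, 10), (10, 12)]
labels, centers = kmeans(pixels, [(0, 0), (10, 10)])
```

In this sketch the two phases exchange only the label array and the centers, which hints at why granularity and data transfer matter when the phases run on different parts of a hybrid processor.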