Journal of Real-Time Image Processing
An ultra-wideband (UWB) microwave imaging system for breast cancer detection consists of antennas, transceivers, and a high-performance embedded system that processes the received signals and reconstructs breast images. This article focuses on the embedded system. To accelerate image reconstruction, the beamforming phase must be implemented in parallel. We assess its implementation on three currently available high-end platforms, based on a multicore CPU, a GPU, and an FPGA, respectively. We then project the results to future many-core CPUs, many-thread GPUs, and advanced FPGAs by applying technology scaling rules. We consider an optimistic case, in which available resources increase according to Moore's law alone, and a pessimistic case, in which only a fraction of those resources is usable because of a limited power budget. In both scenarios, an implementation that includes a high-end FPGA outperforms the alternatives. Since the number of effectively usable cores in future many-core chips will be power-limited, and there is a trend toward integrating power-efficient accelerators, we conjecture that a chip combining a many-core section with a reconfigurable logic section will be the ideal platform for this application.
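The two projection scenarios can be illustrated with a minimal sketch. It is not the scaling model used in the article: the function name `project_resources`, the assumption that resources double with each technology generation, and the constant `usable_fraction` standing in for the power budget are all hypothetical choices made for illustration.

```python
def project_resources(base_units, generations, usable_fraction=1.0):
    """Project usable parallel resources after some technology generations.

    Assumptions (not taken from the article):
      - resources double with each generation, a common reading of Moore's law;
      - a power budget is modeled as a constant fraction of resources that
        can be active at the same time.
    """
    if generations < 0:
        raise ValueError("generations must be non-negative")
    if not 0.0 < usable_fraction <= 1.0:
        raise ValueError("usable_fraction must be in (0, 1]")
    return base_units * (2 ** generations) * usable_fraction


# Optimistic case: every resource added by scaling is usable.
optimistic = project_resources(base_units=8, generations=3)

# Pessimistic case: only half of the resources fit the power budget
# (the value 0.5 is an arbitrary illustration).
pessimistic = project_resources(base_units=8, generations=3, usable_fraction=0.5)

print(optimistic, pessimistic)
```

With the same starting point, the pessimistic projection is simply the optimistic one scaled by the usable fraction, which is why a platform whose advantage comes from power efficiency can stay ahead in both scenarios.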