Evaluation of an accelerator architecture for speckle reducing anisotropic diffusion

Authors:
Siddharth Nilakantan;Srikanth Annangi;Nikhil Gulati;Karthik Sangaiah;Mark Hempstead
Affiliations:
Drexel University, Philadelphia, PA, USA;Drexel University, Philadelphia, PA, USA;Drexel University, Philadelphia, PA, USA;Drexel University, Philadelphia, PA, USA;Drexel University, Philadelphia, PA, USA
Venue:
CASES '11 Proceedings of the 14th international conference on Compilers, architectures and synthesis for embedded systems
Year:
2011


Abstract

Increasing chip power density has brought application-specific accelerator architectures to the forefront as an energy- and area-efficient solution. While GPGPU systems use specialized hardware to perform computationally intensive tasks faster than chip multiprocessor (CMP) systems, accelerators are hardware units designed to execute a specific application efficiently. Real-time ultrasound imaging applications must remove multiplicative noise while maintaining a steady frame rate, which makes them good candidates for exploring accelerator-based systems. In this paper, we propose and evaluate the architecture of an accelerator designed to improve the performance of the speckle reducing anisotropic diffusion (SRAD) image-enhancement algorithm. We compare the projected performance of the SRAD accelerator to software implementations on a multi-core CPU and a CPU+GPU system. The proposed architecture achieves higher throughput by eliminating redundant fetches from memory and by storing intermediate data locally. The GPU achieves a speedup of 3.2x over the CPU, while the accelerator achieves a speedup of 24x. The area efficiency of the GPU and the accelerator is up to 1.6x and 370x better than that of the CPU, respectively. Compared with the CPU, the energy consumed to process a single frame is 1.5x lower on the GPU and up to 580x lower on the accelerator.
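
For readers unfamiliar with SRAD, the following is a minimal software sketch of one update step. It assumes the common two-pass, four-neighbour formulation used in software implementations such as the SRAD kernel in the Rodinia benchmark suite. It is not the accelerator design evaluated in the paper. The function name `srad_step`, the parameter `lam`, the replicated-border handling, and the use of the whole frame to estimate the speckle scale are all assumptions made for illustration. The sketch mainly shows the stencil access pattern: each pixel is read again as a neighbour of the surrounding pixels, and the per-pixel differences and coefficients are intermediate data that the second pass must read back.

```python
def srad_step(img, lam=0.5):
    """One SRAD update on a 2-D list of positive floats (illustrative helper).

    Assumes 0 <= lam <= 1 and strictly positive pixel values.
    """
    rows, cols = len(img), len(img[0])
    n = rows * cols

    # Speckle scale q0^2, estimated here from the whole frame (assumption).
    mean = sum(sum(row) for row in img) / n
    var = sum((v - mean) ** 2 for row in img for v in row) / n
    q0sqr = var / (mean * mean)
    if q0sqr == 0.0:
        return [row[:] for row in img]  # flat frame: nothing to diffuse

    def clamp(k, size):
        # Replicate the border pixel outside the frame (assumption).
        return max(0, min(k, size - 1))

    # Pass 1: directional differences and diffusion coefficient per pixel.
    diffs = [[None] * cols for _ in range(rows)]
    coef = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            jc = img[i][j]
            dn = img[clamp(i - 1, rows)][j] - jc
            ds = img[clamp(i + 1, rows)][j] - jc
            dw = img[i][clamp(j - 1, cols)] - jc
            de = img[i][clamp(j + 1, cols)] - jc
            g2 = (dn * dn + ds * ds + dw * dw + de * de) / (jc * jc)
            lap = (dn + ds + dw + de) / jc
            num = 0.5 * g2 - (lap * lap) / 16.0
            den = 1.0 + 0.25 * lap
            qsqr = num / (den * den)  # instantaneous coefficient of variation
            c = 1.0 / (1.0 + (qsqr - q0sqr) / (q0sqr * (1.0 + q0sqr)))
            coef[i][j] = max(0.0, min(1.0, c))  # keep coefficient in [0, 1]
            diffs[i][j] = (dn, ds, dw, de)

    # Pass 2: divergence of the weighted differences, then the pixel update.
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            dn, ds, dw, de = diffs[i][j]
            cn = cw = coef[i][j]
            cs = coef[clamp(i + 1, rows)][j]
            ce = coef[i][clamp(j + 1, cols)]
            div = cn * dn + cs * ds + cw * dw + ce * de
            out[i][j] = img[i][j] + 0.25 * lam * div
    return out


if __name__ == "__main__":
    # A small synthetic frame with one bright outlier pixel.
    frame = [[10.0] * 4 for _ in range(4)]
    frame[1][1] = 20.0
    smoothed = srad_step(frame, lam=0.5)
    print(smoothed[1][1])
```

In a real-time setting the step would be applied repeatedly to each frame. The two passes over the frame, with `diffs` and `coef` held between them, are the kind of intermediate traffic that a design storing data locally could plausibly avoid, though the sketch makes no claim about how the paper's architecture is organised.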